**Professor Ke-Sheng Cheng**

**Dept. of Bioenvironmental Systems Engineering**

**Email:** rslab@ntu.edu.tw

RSLAB_BSE_NTU

No. 1, Section 4, Roosevelt Road

Bioenvironmental Syst. Eng., National Taiwan University

*Department of Bioenvironmental Systems Engineering/Master Program in Statistics*


**Course announcement**


**Objective**

Data analysis and presentation skills have become essential for students in all disciplines. Data analysis requires knowledge of both statistics and programming. With the advent of statistical software, statistical data analysis and presentation can be carried out quickly and efficiently. R, a programming language and free software environment for statistical computing and graphics, has become one of the most commonly used languages in the statistics community. Thus, the objectives of this course are

- introducing the fundamentals and graphics of R, and
- introducing basic to intermediate techniques for statistical data analysis and graphical presentation, using R as the programming tool and platform.

**Target students**

This course is intended for students who have little or no experience with R. Its focus is to equip students with data analysis skills in R while they learn the fundamental concepts of statistical methods.

**Class format**

This class will be conducted in an interactive format through the following arrangements:

- The subjects and statistical methods (SSMs) to be covered will be fully explained during the first three or four weeks of the semester. For each SSM, problems and tasks with specific learning objectives (important concepts and theories) will guide students toward solutions through coding in R and in-class discussion.
- Students will be divided into several groups based on their backgrounds or interests. Each group will be assigned certain SSMs to study during the semester.
- Every week, two or more groups (depending on the number of groups in the class) will present their progress and results in class, followed by discussion.
- Each group is expected to make several presentations during the semester, ideally once every 3 to 4 weeks.
- A final presentation is required for all groups in the final week.

**The SSMs covered this semester will depend on the enrolled students.** The SSMs from previous years listed below are for reference only.

**Evaluation**

Students will be evaluated on their progress reports and presentations, as well as their participation in class discussions.

**DCAV_R groups of 2021 fall semester (updated on 10/11/2021)**

**Weekly schedule of individual groups**

### R - Introduction & Graphics

### SSM1 Rejection method for random number generation

*Stochastic simulation/Monte Carlo simulation*

- Pseudo random number generation
- Probability Integral Transformation
- Random number generation using R
- The acceptance/rejection method
- Assessing the efficiency of random number generation
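
As a minimal sketch of the acceptance/rejection method and its efficiency, the following R code draws from a Beta(2, 2) density using a Uniform(0, 1) proposal. The target density, the bound `c_bound`, and the function name `rejection_sample` are illustrative assumptions, not part of the course material.

```r
# Acceptance/rejection sampling for the Beta(2, 2) density f(x) = 6x(1 - x),
# using a Uniform(0, 1) proposal g(x) = 1 (an illustrative choice).
set.seed(1)

f <- function(x) 6 * x * (1 - x)    # target density on [0, 1]
c_bound <- 1.5                      # max of f(x) / g(x), attained at x = 0.5

rejection_sample <- function(n_proposals) {
  x <- runif(n_proposals)           # candidates drawn from the proposal
  u <- runif(n_proposals)           # uniforms for the accept/reject step
  accepted <- u <= f(x) / c_bound   # accept with probability f(x) / (c * g(x))
  list(sample = x[accepted], acceptance_rate = mean(accepted))
}

res <- rejection_sample(100000)
cat("Acceptance rate:", res$acceptance_rate, "(theoretical 1/c =", 1 / c_bound, ")\n")
cat("Sample mean:", mean(res$sample), "(theoretical 0.5)\n")
```

The acceptance rate approximates 1/c, so a tighter bound between the target and the proposal gives a more efficient generator.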

**PPT of the first report by Group 1**

### SSM2 Supervised classification - the multivariate Gaussian maximum likelihood classifier

- Simulation of 2-class, 2-feature Gaussian maximum likelihood classification.
- Confusion matrix
- Uncertainty assessment of classification accuracy
- Stochastic simulation for the performance evaluation of the supervised classification (**PDF1**, **PDF2**)
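
The topics above can be sketched in base R as follows. This is a hedged illustration only: the class means and covariances, the sample size, the equal-prior assumption, and the helper names `rbvnorm` and `log_lik` are all hypothetical choices made for this example.

```r
# 2-class, 2-feature Gaussian maximum likelihood classification (equal priors).
set.seed(42)

# Draw n points from a bivariate normal using the Cholesky factor of sigma.
rbvnorm <- function(n, mu, sigma) {
  z <- matrix(rnorm(2 * n), ncol = 2)
  sweep(z %*% chol(sigma), 2, mu, "+")
}

# Gaussian log-likelihood of each row of x (constant term dropped).
log_lik <- function(x, mu, sigma) {
  d <- sweep(x, 2, mu, "-")
  -0.5 * rowSums((d %*% solve(sigma)) * d) - 0.5 * log(det(sigma))
}

# Hypothetical class parameters (illustrative only).
mu1 <- c(0, 0); sigma1 <- matrix(c(1, 0.3, 0.3, 1), 2)
mu2 <- c(2, 2); sigma2 <- matrix(c(1.5, -0.2, -0.2, 1), 2)

n <- 500
train1 <- rbvnorm(n, mu1, sigma1)
train2 <- rbvnorm(n, mu2, sigma2)
test_x <- rbind(rbvnorm(n, mu1, sigma1), rbvnorm(n, mu2, sigma2))
truth  <- factor(rep(1:2, each = n), levels = 1:2)

# Estimate class means and covariance matrices from the training data.
m1 <- colMeans(train1); s1 <- cov(train1)
m2 <- colMeans(train2); s2 <- cov(train2)

# Assign each test point to the class with the larger log-likelihood.
pred <- factor(ifelse(log_lik(test_x, m1, s1) >= log_lik(test_x, m2, s2), 1, 2),
               levels = 1:2)

confusion <- table(truth = truth, predicted = pred)
accuracy  <- sum(diag(confusion)) / sum(confusion)
print(confusion)
cat("Overall accuracy:", accuracy, "\n")
```

Repeating the whole procedure with different seeds gives a distribution of accuracy values, which is one way to approach the uncertainty assessment.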

**PPT of the first report by Group 2**

### SSM3 Asymptotic distribution of the test statistic of the Kolmogorov-Smirnov test

*2021-10-06*

- Techniques for goodness-of-fit tests
- Chi-squared test
- Kolmogorov-Smirnov GOF test
- Test statistic Dn
- KS goodness of fit test in R
- Interpretation of the probability distribution of Dn
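
As a minimal sketch of how the sampling distribution of Dn might be approximated by stochastic simulation, the code below uses Uniform(0, 1) samples, which is sufficient because the null distribution of Dn does not depend on the (fully specified, continuous) hypothesized distribution. The sample size `n = 20`, the number of replications, and the function name `simulate_dn` are assumptions chosen for illustration.

```r
# Simulate the null distribution of the one-sample KS statistic Dn.
set.seed(123)

simulate_dn <- function(n, n_sim = 10000) {
  i <- seq_len(n)
  replicate(n_sim, {
    x <- sort(runif(n))                    # sample under H0; F(x) = x for U(0, 1)
    max(pmax(i / n - x, x - (i - 1) / n))  # largest gap between ECDF and F
  })
}

n  <- 20
dn <- simulate_dn(n)

# Simulated critical values at the 0.05 and 0.01 significance levels.
crit <- quantile(dn, probs = c(0.95, 0.99))
print(crit)

# A histogram of dn shows the shape of the sampling distribution.
# hist(dn, breaks = 50, main = "Simulated distribution of Dn (n = 20)")
```

The simulated quantiles can be compared with a published KS table for the same sample size.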

Tables of the critical values of the test statistic *Dn* at the 0.05 and 0.01 levels of significance can be found in most statistics books. In this SSM assignment, we aim to derive the same critical values by stochastic simulation.

**PPT of the first report by Group 3**

### SSM4 Drought index (SPI) calculation and spatiotemporal visualization

The Standardized Precipitation Index (SPI) is a measure of drought. You will learn how to calculate the SPI from daily rainfall at multiple rainfall stations and use the results to evaluate the spatiotemporal variation of drought occurrence.

**Data to be used:**

- Daily rainfall data (04/01/1995 - 03/31/2007) at 50 rainfall stations
- Location (latitude, longitude) and station ID of 50 rainfall stations

**Expected results**

- For SPI calculation, use 1 ten-day period (TDP) as the operation scale; the SPI should have a time resolution of 3 TDPs.
- Show the spatiotemporal variation of SPI values [Example in Myanmar, PPT-A; PPT-B; Example in Taiwan, PPT-C]
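
The core SPI transformation can be sketched as below. This is a simplified illustration under several assumptions: the rainfall series is synthetic (not the station data), the gamma distribution is fitted by the method of moments, all periods are pooled rather than fitted separately for each TDP of the year, and the 3-TDP accumulation window `k = 3` and the function names are hypothetical choices.

```r
# Simplified SPI calculation on TDP rainfall totals.
set.seed(7)

# Synthetic ten-day-period (TDP) rainfall totals in mm; NOT real station data.
tdp_rain <- rgamma(360, shape = 0.8, scale = 40)
tdp_rain[runif(360) < 0.1] <- 0            # insert some dry periods

# Rolling sum over the current and previous k - 1 TDPs (NA for the first k - 1).
rolling_sum <- function(x, k) {
  as.numeric(stats::filter(x, rep(1, k), sides = 1))
}

spi_from_totals <- function(totals) {
  ok  <- !is.na(totals)
  x   <- totals[ok]
  q0  <- mean(x == 0)                      # probability of a zero total
  pos <- x[x > 0]
  shape <- mean(pos)^2 / var(pos)          # method-of-moments gamma fit
  scale <- var(pos) / mean(pos)
  # Mixed distribution: point mass at zero plus gamma for positive totals.
  prob <- q0 + (1 - q0) * pgamma(x, shape = shape, scale = scale)
  prob <- pmin(pmax(prob, 1e-6), 1 - 1e-6) # keep qnorm() finite
  spi <- rep(NA_real_, length(totals))
  spi[ok] <- qnorm(prob)                   # equiprobability transform to N(0, 1)
  spi
}

totals <- rolling_sum(tdp_rain, k = 3)
spi    <- spi_from_totals(totals)
summary(spi)
```

Applying the same function station by station gives one SPI series per station, which can then be mapped over time using the station coordinates.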

**PPT of the first report by Group 4**

### SSM5 Stochastic simulation of a bivariate gamma distribution

- Bivariate Gaussian distribution
- Stochastic simulation of a bivariate Gaussian distribution
- Frequency factor of the gamma (Pearson type III) distribution
- Bivariate gamma distribution
- Correlation coefficient transformation
- Stochastic simulation of a bivariate gamma distribution
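
One common way to simulate correlated gamma variates is a Gaussian copula: simulate a bivariate Gaussian pair and map each margin to a gamma distribution through its quantile function. The sketch below uses that approach, which is not necessarily the frequency-factor method covered in this SSM; the parameter values and the function name `rbvgamma_copula` are illustrative assumptions.

```r
# Bivariate gamma simulation through a Gaussian copula (illustrative approach).
set.seed(2021)

rbvgamma_copula <- function(n, rho_z, shape, scale) {
  # Step 1: standard bivariate Gaussian pair with correlation rho_z.
  z1 <- rnorm(n)
  z2 <- rho_z * z1 + sqrt(1 - rho_z^2) * rnorm(n)
  # Step 2: probability integral transform to uniforms, then to gamma margins.
  cbind(x1 = qgamma(pnorm(z1), shape = shape[1], scale = scale[1]),
        x2 = qgamma(pnorm(z2), shape = shape[2], scale = scale[2]))
}

xy <- rbvgamma_copula(50000, rho_z = 0.7, shape = c(2, 5), scale = c(10, 3))

cat("Sample means:", colMeans(xy), "(theoretical 20 and 15)\n")
cat("Gaussian correlation: 0.7; gamma correlation:", cor(xy)[1, 2], "\n")
```

The correlation of the gamma pair differs from the Gaussian correlation `rho_z`, which is why a correlation coefficient transformation is needed when a specific gamma correlation is targeted.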

**PPT of the first report by Group 5**

### SSM6 Rainfall frequency analysis using annual maximum series (AMS) and event maximum series (EMS) - Mixture distribution modeling of the annual maximum rainfalls (PPT)

- The annual maximum series (AMS)
- The Extremal Type Theorem (ETT) and the GEV distribution
- The event maximum series (EMS)
- Mixture distribution modeling of the AMS
- Evaluation by stochastic simulation

### SSM7 L-moment-ratio diagram (LMRD) for GOF test

Establishing acceptance regions for L-moments-based goodness-of-fit tests by stochastic simulation. *Journal of Hydrology*, Vol. 355, No. 1-4, 49-62. (doi:10.1016/j.jhydrol.2008.02.023).

- Moment-ratio diagram and its usage in goodness-of-fit test
- L-moments (linear moments)
- L-moments of the Gaussian and Gumbel distributions
- L-moment-ratio diagram (LMRD)
- LMRD of the Gaussian and Gumbel distributions
- Acceptance regions of LMRD-based GOF tests
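
As a hedged sketch of the computations behind an LMRD, the code below estimates sample L-moment ratios from unbiased probability-weighted moments and simulates the scatter of (L-skewness, L-kurtosis) for small Gaussian samples. The function name `sample_lmom`, the sample sizes, and the number of replications are assumptions made for this example. The theoretical ratio pairs are (0, 0.1226) for the Gaussian and (0.1699, 0.1504) for the Gumbel distribution.

```r
# Sample L-moments and L-moment ratios via probability-weighted moments.
set.seed(10)

sample_lmom <- function(x) {
  x <- sort(x); n <- length(x); i <- seq_len(n)
  b0 <- mean(x)
  b1 <- sum((i - 1) / (n - 1) * x) / n
  b2 <- sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
  b3 <- sum((i - 1) * (i - 2) * (i - 3) / ((n - 1) * (n - 2) * (n - 3)) * x) / n
  l2 <- 2 * b1 - b0
  l3 <- 6 * b2 - 6 * b1 + b0
  l4 <- 20 * b3 - 30 * b2 + 12 * b1 - b0
  c(l1 = b0, l2 = l2, t3 = l3 / l2, t4 = l4 / l2)  # t3: L-skewness, t4: L-kurtosis
}

# Large-sample check against the theoretical ratios.
lm_gauss  <- sample_lmom(rnorm(100000))
lm_gumbel <- sample_lmom(-log(-log(runif(100000))))  # standard Gumbel variates
print(round(rbind(gaussian = lm_gauss, gumbel = lm_gumbel), 4))

# Scatter of (t3, t4) for 1000 Gaussian samples of size 50; the spread of this
# cloud is what an acceptance region for a GOF test would be built from.
cloud <- t(replicate(1000, sample_lmom(rnorm(50))[c("t3", "t4")]))
# plot(cloud, xlab = "L-skewness", ylab = "L-kurtosis")
```

Repeating the last step for other sample sizes shows how the scatter, and hence any acceptance region, shrinks as the sample size grows.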

### SSM8 Variable-scale standardized precipitation index (VS-SPI) [PPT1, PPT2, PPT3, PPT4]

- Disadvantages of the fixed-scale SPI.
- Drought detection and monitoring using a variable-scale SPI.
- Regional VS-SPI
- A reference website: https://teng5.shinyapps.io/SPI_region/

### SSM9 Gamma random field simulation

- Sequential Gaussian simulation (SGS)
- Gamma random field simulation
- Potential applications

### SSMx Model performance evaluation - Assessing the uncertainties in real-time forecasting

- Model performance evaluation criteria
- Nash-Sutcliffe efficiency (NSE), also known as the coefficient of efficiency (CE)
- Coefficient of Persistence (CP)
- Sample-dependent CE-CP relationship
- Model-dependent CE-CP relationship
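
The two criteria can be computed as in the sketch below, which uses the usual definitions: CE compares the model error with the variance of the observations, and CP compares it with the error of a naive forecast that repeats the last observation. The flow values and the function names `ce` and `cp` are hypothetical and serve only as an illustration.

```r
# Coefficient of efficiency (CE, i.e. NSE) and coefficient of persistence (CP).
ce <- function(obs, sim) {
  1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)
}

cp <- function(obs, sim, lead = 1) {
  idx <- (lead + 1):length(obs)            # time steps with a previous observation
  1 - sum((obs[idx] - sim[idx])^2) / sum((obs[idx] - obs[idx - lead])^2)
}

# Hypothetical hourly flows (illustrative values only).
obs <- c(5, 8, 20, 45, 60, 52, 38, 25, 15, 9)

# Naive one-step-ahead forecast: the forecast at time t is the observation at t - 1.
naive <- c(obs[1], obs[-length(obs)])

cat("CE of naive forecast:", ce(obs, naive), "\n")
cat("CP of naive forecast:", cp(obs, naive), "\n")   # 0 by construction
cat("CE of perfect forecast:", ce(obs, obs), "\n")   # 1 by construction
```

A model with a high CE can still have a CP near or below zero, which means it does no better than the naive forecast.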

### SSMx Change detection using the Mann-Whitney-Pettitt (MWP) test

**A Non-parametric Approach to the Change-point Problem (A.N. Pettitt, Journal of the Royal Statistical Society. Series C, 1979)**

### SSMx Rainfall-Runoff Modeling

Animation of rainfall-unit hydrograph-runoff simulation by Bo-Yu Chen.

**1-hr unit hydrograph UH(1,t) of Wu-Duh flow station and hourly rainfalls of two storm events**

### SSMx IDF Uncertainty - Bootstrap sampling

**Hourly rainfall data of two rainfall stations in northern Taiwan** (**Hourly_Rainfall_Data.zip**).

- Extract the annual maximum rainfalls of various durations (1, 2, 3, 6, 12, 24, 48 hours). These are known as the annual maximum series (AMS).
- Conduct goodness-of-fit tests to choose the best probability distribution for rainfall frequency analysis.
- Determine distribution parameters by using the method of moments and method of L-moments.
- For a specific duration, calculate the design rainfall depths of 5, 10, 25, 50, 100 and 200-year return periods.
- Plot the Depth-Duration-Frequency (DDF) curve and the Intensity-Duration-Frequency (IDF) curve.
- Evaluate the results
- Investigate uncertainty of the IDF curve by bootstrapping from the annual maximum series.
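
The bootstrap step for a single duration can be sketched as follows. The example assumes a Gumbel distribution fitted by the method of moments and uses a synthetic 40-year AMS rather than the station data; the distribution choice, the parameter values, and the function names `fit_gumbel_mom` and `design_depth` are illustrative assumptions.

```r
# Bootstrap uncertainty of design rainfall depths for one duration.
set.seed(99)

# Synthetic 24-hour annual maximum series (mm) for 40 years; NOT real data.
ams <- 150 - 50 * log(-log(runif(40)))

# Method-of-moments fit of the Gumbel distribution.
fit_gumbel_mom <- function(x) {
  scale <- sd(x) * sqrt(6) / pi
  c(loc = mean(x) - 0.5772157 * scale, scale = scale)
}

# Design depth for return period T (non-exceedance probability 1 - 1/T).
design_depth <- function(par, T) {
  par[["loc"]] - par[["scale"]] * log(-log(1 - 1 / T))
}

return_periods <- c(5, 10, 25, 50, 100, 200)
point_est <- design_depth(fit_gumbel_mom(ams), return_periods)

# Resample the AMS with replacement, refit, and recompute the design depths.
boot <- replicate(2000, {
  resampled <- sample(ams, replace = TRUE)
  design_depth(fit_gumbel_mom(resampled), return_periods)
})

# 95% percentile intervals for each return period.
ci <- apply(boot, 1, quantile, probs = c(0.025, 0.975))
result <- data.frame(T = return_periods, depth = point_est,
                     lower = ci[1, ], upper = ci[2, ])
print(round(result, 1))
```

Running the same steps for each duration and dividing the depths by the duration gives intensity values with bootstrap bands for the IDF curve.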
