Laboratory for Remote Sensing Hydrology and Spatial Modeling (RSLAB)

 

Professor Ke-Sheng Cheng

Dept of Bioenvironmental Systems Engineering &
Master Program in Statistics

National Taiwan University

Email: rslab@ntu.edu.tw

RSLAB_BSE_NTU
No. 1, Section 4, Roosevelt Road
Bioenvironmental Syst. Eng., National Taiwan University


Data Computation, Analysis, and Visualization Using R   

(Application of the R Language to Data Analysis, Computation, and Visualization)

Department of Bioenvironmental Systems Engineering/Master Program in Statistics

Prof. Ke-Sheng Cheng


 

Course Announcements

 



Objective

Data analysis and presentation skills have become essential for students of all disciplines. Data analysis requires knowledge of both statistics and computer programming. With the advent of statistical software, statistical data analysis and presentation can be conducted quickly and efficiently. R, a programming language and free software environment for statistical computing and graphics, has become one of the most commonly used computing languages in the statistics community. The objectives of this course are therefore to:

  1. introduce the fundamentals and graphics of R, and
  2. introduce basic to intermediate techniques for statistical data analysis and graphic presentation, using R as the programming tool and platform.

Target students

This course is intended for students who have little or no experience with R. Its focus is to equip students with data analysis skills in R while they learn the fundamental concepts of statistical methods.

Class format

This class will be conducted in an interactive format through the following arrangements:

  1. The subjects and statistical methods (SSMs) to be covered will be fully explained at the beginning of the semester (the first three or four weeks). For each SSM, problems and tasks with specific learning objectives (important concepts and theories) will guide students toward solutions through computer coding (using R) and in-class discussions.
  2. Students will be grouped (based on their backgrounds or interests) into several groups. Each group will be assigned certain SSMs to study during the semester.
  3. Every week, two or more groups (depending on the number of groups in the class) will present their progress and results in class, followed by discussions.
  4. Each group is expected to make several presentations during the semester, ideally once every 3 to 4 weeks.
  5. A final presentation is required for all groups in the final week.

The SSMs covered this semester will depend on the enrolled students. The SSMs from previous years listed below are for reference only.

Evaluation

Students will be evaluated on their performance in the progress reports and presentations, as well as their participation in class discussions.

DCAV_R groups of the 2021 fall semester (updated on 10/11/2021)

Weekly schedule of individual groups

  • R - Introduction & Graphics

    09-18-2019

    Introduction to R  

    Fundamental Graphics in R  

    LULC_Example.RData  

    3dplot.R  

  • SSM1  Rejection method for random number generation   

    Stochastic simulation/Monte Carlo simulation

    1. Pseudo random number generation
    2. Probability Integral Transformation
    3. Random number generation using R
    4. The acceptance/rejection method 
    5. Assessing the efficiency of random number generation
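
The acceptance/rejection method and its efficiency can be illustrated with a minimal sketch. It is written in Python for illustration only (the course itself works in R), and the target density, the uniform proposal, and the names `target_pdf` and `rejection_sample` are assumptions chosen for the example, not part of the course material.

```python
import random

def target_pdf(x):
    """Hypothetical target density f(x) = 6x(1 - x) on [0, 1] (a Beta(2, 2) density)."""
    return 6.0 * x * (1.0 - x)

def rejection_sample(n, rng, bound=1.5):
    """Draw n samples from target_pdf using a Uniform(0, 1) proposal.

    bound must satisfy target_pdf(x) <= bound on [0, 1];
    the maximum of 6x(1 - x) is 1.5, reached at x = 0.5.
    Returns the samples and the observed acceptance rate.
    """
    samples = []
    proposals = 0
    while len(samples) < n:
        x = rng.random()   # candidate drawn from the proposal density g(x) = 1
        u = rng.random()   # uniform variate for the accept/reject decision
        proposals += 1
        if u * bound <= target_pdf(x):
            samples.append(x)
    return samples, n / proposals

rng = random.Random(42)
samples, acceptance_rate = rejection_sample(20000, rng)
sample_mean = sum(samples) / len(samples)
print(f"mean = {sample_mean:.3f}, acceptance rate = {acceptance_rate:.3f}")
```

With a uniform proposal the theoretical acceptance rate is 1/bound = 2/3, so the observed rate gives a direct measure of efficiency; a tighter envelope around the target density would reject fewer candidates.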

    PPT of the first report by Group 1

     

  • SSM2  Supervised classification - the multivariate Gaussian maximum likelihood classifier

    1. Simulation of 2-class, 2-feature Gaussian maximum likelihood classification.
    2. Confusion matrix
    3. Uncertainty assessment of classification accuracy
    4. Stochastic simulation for the performance evaluation of the supervised classification (PDF1, PDF2)
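
A 2-class, 2-feature simulation of this kind might look like the following sketch (Python for illustration; the course uses R). The class means, sample sizes, equal prior probabilities, and all function names are hypothetical choices made for the example.

```python
import math
import random

def fit_gaussian(points):
    """Estimate the mean vector and covariance terms from 2-feature training samples."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def log_likelihood(p, mean, cov):
    """Bivariate Gaussian log-density, omitting the constant -log(2*pi)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    mahalanobis = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (math.log(det) + mahalanobis)

def simulate(mean, n, rng):
    """Draw n points with independent, unit-variance Gaussian features (an assumption)."""
    return [(rng.gauss(mean[0], 1.0), rng.gauss(mean[1], 1.0)) for _ in range(n)]

rng = random.Random(1)
class_means = [(0.0, 0.0), (2.0, 2.0)]   # hypothetical class centres
models = [fit_gaussian(simulate(m, 200, rng)) for m in class_means]

# confusion[i][j] counts test points of true class i assigned to class j.
confusion = [[0, 0], [0, 0]]
for true_class, m in enumerate(class_means):
    for p in simulate(m, 500, rng):
        scores = [log_likelihood(p, mean, cov) for mean, cov in models]
        predicted = scores.index(max(scores))   # equal priors: pick the larger likelihood
        confusion[true_class][predicted] += 1

accuracy = (confusion[0][0] + confusion[1][1]) / 1000
print(confusion, accuracy)
```

Repeating the whole experiment with different seeds yields a distribution of accuracies, which is one way to assess the uncertainty of the classification accuracy by stochastic simulation.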

    PPT of the first report by Group 2


  • SSM3  Asymptotic distribution of the test statistic of the Kolmogorov-Smirnov test

    2021-10-06

    1. Techniques for goodness of fit test
    2. Chi-squared test
    3. Kolmogorov-Smirnov GOF test
    4. Test statistic Dn
    5. KS goodness of fit test in R
    6. Interpretation of the probability distribution of Dn

    Tables of the critical values of the test statistic Dn at the 0.05 and 0.01 levels of significance can be found in most statistics books. In this SSM assignment, we aim to derive the same critical values by stochastic simulation.
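
The simulation idea can be sketched as follows (Python for illustration; the course uses R). Because the null distribution of Dn does not depend on the hypothesized continuous distribution when that distribution is fully specified, the sketch samples from Uniform(0, 1). The sample size n = 20, the number of replications, and the function names are assumptions made for the example.

```python
import math
import random

def ks_statistic_uniform(sample):
    """Dn of a sample tested against the Uniform(0, 1) distribution."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))   # ECDF above the CDF
    d_minus = max(x - i / n for i, x in enumerate(xs))        # CDF above the ECDF
    return max(d_plus, d_minus)

def simulate_critical_values(n, replications, alphas, rng):
    """Estimate critical values as upper quantiles of simulated Dn values."""
    stats = sorted(
        ks_statistic_uniform([rng.random() for _ in range(n)])
        for _ in range(replications)
    )
    return {a: stats[math.ceil((1.0 - a) * replications) - 1] for a in alphas}

rng = random.Random(2021)
critical = simulate_critical_values(20, 20000, (0.05, 0.01), rng)
for alpha, value in critical.items():
    print(f"alpha = {alpha}: simulated critical Dn = {value:.3f}")
```

The simulated values can then be compared with the tabulated critical values for the same sample size.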

    PPT of the first report by Group 3


  • SSM4  Drought index (SPI) calculation and spatiotemporal visualization

    The Standardized Precipitation Index (SPI) is a measure of drought. You will learn how to calculate the SPI from daily rainfall data recorded at multiple rainfall stations and use the results to evaluate the spatiotemporal variation of drought occurrences.

    Data to be used:

    1. Daily rainfall data (04/01/1995 - 03/31/2007) at 50 rainfall stations  
    2. Location (latitude, longitude) and station ID of 50 rainfall stations  

    Expected results

    1. For SPI calculation, use 1 ten-day period (TDP) as the operation scale; the SPI should have a time resolution of 3 TDPs.
    2. Show the spatiotemporal variation of SPI values [Example in Myanmar: PPT-A, PPT-B; Example in Taiwan: PPT-C]
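
The core SPI transformation can be sketched as below (Python for illustration; the course uses R). Several simplifications are assumptions of this sketch and not the course's prescribed procedure: the gamma distribution is fitted by the method of moments, all periods are pooled into one fit instead of fitting each calendar period separately, the gamma CDF is evaluated with a power series, and the rainfall series is synthetic.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev, pvariance

def gamma_cdf(x, shape, scale, max_terms=1000):
    """Regularized lower incomplete gamma function P(shape, x/scale) by power series."""
    if x <= 0.0:
        return 0.0
    t = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, max_terms):
        term *= t / (shape + n)
        total += term
        if term < 1e-14 * total:
            break
    return total * math.exp(shape * math.log(t) - t - math.lgamma(shape))

def rolling_sums(values, window):
    """Accumulate rainfall over a moving window of `window` consecutive periods."""
    return [sum(values[i - window + 1:i + 1]) for i in range(window - 1, len(values))]

def spi(totals):
    """Map rainfall totals to standard normal quantiles through a fitted gamma CDF."""
    wet = [v for v in totals if v > 0.0]
    p_zero = 1.0 - len(wet) / len(totals)      # probability of a zero total
    m, v = fmean(wet), pvariance(wet)
    shape, scale = m * m / v, v / m            # method-of-moments estimates
    normal = NormalDist()
    result = []
    for x in totals:
        h = p_zero + (1.0 - p_zero) * gamma_cdf(x, shape, scale)
        h = min(max(h, 1e-6), 1.0 - 1e-6)      # keep the quantile finite
        result.append(normal.inv_cdf(h))
    return result

rng = random.Random(7)
# Synthetic TDP rainfall: 12 years x 36 ten-day periods (hypothetical values).
tdp_rainfall = [rng.gammavariate(2.0, 15.0) for _ in range(432)]
three_tdp_totals = rolling_sums(tdp_rainfall, 3)
spi_values = spi(three_tdp_totals)
spi_mean, spi_sd = fmean(spi_values), pstdev(spi_values)
print(f"SPI mean = {spi_mean:.2f}, SPI sd = {spi_sd:.2f}")
```

Applying the same function to each station's series produces the station-by-time SPI table needed for spatiotemporal mapping.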

    PPT of the first report by Group 4


  • SSM5  Stochastic simulation of a bivariate gamma distribution

    1. Bivariate Gaussian distribution
    2. Stochastic simulation of a bivariate Gaussian distribution
    3. Frequency factor of the gamma (Pearson type III) distribution
    4. Bivariate gamma distribution
    5. Correlation coefficient transformation
    6. Stochastic simulation of a bivariate gamma distribution
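
As a starting point for items 1 and 2, a bivariate Gaussian pair with a prescribed correlation can be simulated from two independent standard normal variates. The sketch below is in Python for illustration (the course uses R); the parameter values and function names are hypothetical, and the gamma steps (frequency factor and correlation coefficient transformation) are not covered here.

```python
import math
import random

def simulate_bivariate_gaussian(n, mean, sd, rho, rng):
    """Simulate n correlated Gaussian pairs with correlation coefficient rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = mean[0] + sd[0] * z1
        # rho*z1 + sqrt(1 - rho^2)*z2 is standard normal with correlation rho to z1.
        y = mean[1] + sd[1] * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        pairs.append((x, y))
    return pairs

def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(5)
pairs = simulate_bivariate_gaussian(20000, mean=(10.0, 50.0), sd=(2.0, 8.0), rho=0.7, rng=rng)
r = sample_correlation(pairs)
print(f"sample correlation = {r:.3f}")
```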

    SERRA article 

    PPT of the first report by Group 5


  • SSM6  Rainfall frequency analysis using annual maximum series (AMS) and event maximum series (EMS) - Mixture distribution modeling of the annual maximum rainfalls (PPT)

    1. The annual maximum series (AMS)
    2. The Extremal Type Theorem (ETT) and the GEV distribution
    3. The event maximum series (EMS)
    4. Mixture distribution modeling of the AMS
    5. Evaluation by stochastic simulation

  • SSM7  L-moment-ratio diagram (LMRD) for GOF test

    Establishing acceptance regions for L-moments-based goodness-of-fit tests by stochastic simulation.  Journal of Hydrology, Vol. 355, No.1-4, 49-62. (doi:10.1016/j.jhydrol.2008.02.023).

    1. Moment-ratio diagram and its usage in goodness-of-fit test
    2. L-moments (linear moments)
    3. L-moments of the Gaussian and Gumbel distributions
    4. L-moment-ratio diagram (LMRD)
    5. LMRD of the Gaussian and Gumbel distributions
    6. Acceptance regions of LMRD-based GOF tests
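
Sample L-moment ratios, the coordinates plotted on an LMRD, can be computed from probability-weighted moments. The sketch below is in Python for illustration (the course uses R); the function name `sample_l_moments` and the simulated sample sizes are assumptions of the example.

```python
import math
import random

def sample_l_moments(data):
    """Return (l1, l2, t3, t4): the first two sample L-moments, L-skewness and L-kurtosis."""
    xs = sorted(data)
    n = len(xs)
    b0 = b1 = b2 = b3 = 0.0
    for i, x in enumerate(xs):   # i equals (rank - 1) of the order statistic
        b0 += x
        b1 += x * i / (n - 1)
        b2 += x * i * (i - 1) / ((n - 1) * (n - 2))
        b3 += x * i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    l4 = 20.0 * b3 - 30.0 * b2 + 12.0 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2

rng = random.Random(11)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(10000)]
# Standard Gumbel variates by inverse-CDF transformation of uniform variates.
gumbel = [-math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12))) for _ in range(10000)]

_, _, t3_gauss, t4_gauss = sample_l_moments(gaussian)
_, _, t3_gumbel, t4_gumbel = sample_l_moments(gumbel)
print(f"Gaussian: t3 = {t3_gauss:.3f}, t4 = {t4_gauss:.3f}")
print(f"Gumbel:   t3 = {t3_gumbel:.3f}, t4 = {t4_gumbel:.3f}")
```

The theoretical points are (0, 0.1226) for the Gaussian and (0.1699, 0.1504) for the Gumbel distribution. Repeating the simulation many times at a fixed, smaller sample size gives a cloud of (t3, t4) points from which an acceptance region can be drawn.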


  • SSM8 Variable-scale standardized precipitation index (VS-SPI) [PPT1, PPT2, PPT3, PPT4]

    1. Disadvantages of the fixed-scale SPI.
    2. Drought detection and monitoring using a variable-scale SPI.
    3. Regional VS-SPI
    4. A reference website: https://teng5.shinyapps.io/SPI_region/

  • SSM9  Gamma random field simulation

    1. Sequential Gaussian simulation (SGS)
    2. Gamma random field simulation
    3. Potential applications

    SERRA article

  • SSMx  Model performance evaluation - Assessing the uncertainties in real-time forecasting

    1. Model performance evaluation criteria
    2. Nash-Sutcliffe efficiency (NSE), also called the coefficient of efficiency (CE)
    3. Coefficient of Persistence (CP)
    4. Sample-dependent CE-CP relationship 
    5. Model-dependent CE-CP relationship
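
The two criteria can be sketched as follows (Python for illustration; the course uses R). CE compares the forecast errors with the variation of the observations about their mean, while CP compares them with the errors of a naive forecast that persists the observation from `lead` steps earlier. The index range used for CP, the function names, and the flow values are assumptions of this sketch.

```python
def coefficient_of_efficiency(observed, forecast):
    """CE (NSE) = 1 - SSE / sum of squared deviations of observations from their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

def coefficient_of_persistence(observed, forecast, lead=1):
    """CP = 1 - SSE / SSE of the naive forecast, both summed over t >= lead."""
    steps = range(lead, len(observed))
    sse = sum((observed[t] - forecast[t]) ** 2 for t in steps)
    sse_naive = sum((observed[t] - observed[t - lead]) ** 2 for t in steps)
    return 1.0 - sse / sse_naive

# Hypothetical observed and forecast flows for one event.
observed = [2.0, 3.5, 6.0, 9.0, 7.5, 5.0, 3.0]
forecast = [2.2, 3.0, 5.5, 8.0, 8.0, 5.5, 3.2]

ce = coefficient_of_efficiency(observed, forecast)
cp = coefficient_of_persistence(observed, forecast, lead=1)
print(f"CE = {ce:.3f}, CP = {cp:.3f}")
```

A forecast that merely repeats the previous observation scores CP = 0 while it can still achieve a high CE, which is why the two criteria are examined together.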

    SERRA article 

  • SSMx  Change detection using the Mann-Whitney-Pettitt (MWP) test

    A Non-parametric Approach to the Change-point Problem (A.N. Pettitt, Journal of the Royal Statistical Society. Series C, 1979)
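
A direct, unoptimized sketch of the test statistic is given below (Python for illustration; the course uses R). It evaluates U(t), the sum of sign(x_i - x_j) over i <= t < j, for every candidate change point, takes K = max |U(t)|, and uses the approximate significance probability p = 2 exp(-6K^2 / (n^3 + n^2)). The function names and the synthetic series are assumptions of the example.

```python
import math
import random

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def pettitt_test(series):
    """Return (change_point, K, p), where change_point is the length of the first segment."""
    n = len(series)
    change_point, k = 0, 0
    for t in range(1, n):
        u = sum(sign(series[i] - series[j]) for i in range(t) for j in range(t, n))
        if abs(u) > k:
            change_point, k = t, abs(u)
    p = min(1.0, 2.0 * math.exp(-6.0 * k * k / (n ** 3 + n ** 2)))
    return change_point, k, p

rng = random.Random(3)
# Synthetic series with a shift in the mean after the 30th observation.
series = [rng.gauss(0.0, 1.0) for _ in range(30)] + [rng.gauss(4.0, 1.0) for _ in range(30)]
change_point, k, p = pettitt_test(series)
print(f"change point after observation {change_point}, K = {k}, p = {p:.2e}")
```

The triple loop costs O(n^3) operations, which is acceptable for annual series; longer records would call for a rank-based update of U(t).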

  • SSMx  Rainfall-Runoff Modeling    

    Animation of rainfall-unit hydrograph-runoff simulation by Bo-Yu Chen.   

     1-hr unit hydrograph UH(1,t)  of Wu-Duh flow station and hourly rainfalls of two storm events  
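
Direct runoff is obtained by convolving the effective rainfall with the unit hydrograph: each hourly rainfall pulse produces a scaled, time-shifted copy of UH(1,t), and the copies are summed. The sketch below is in Python for illustration (the course uses R); the ordinates and rainfall depths are hypothetical and are not the Wu-Duh data.

```python
def convolve_runoff(effective_rainfall, unit_hydrograph):
    """Discrete convolution of hourly effective rainfall with 1-hr unit hydrograph ordinates."""
    runoff = [0.0] * (len(effective_rainfall) + len(unit_hydrograph) - 1)
    for m, depth in enumerate(effective_rainfall):
        for k, ordinate in enumerate(unit_hydrograph):
            # The pulse falling in hour m contributes to the flow k hours later.
            runoff[m + k] += depth * ordinate
    return runoff

unit_hydrograph = [0.2, 0.5, 0.3]   # hypothetical ordinates per unit of effective rainfall
effective_rainfall = [1.0, 2.0]     # hypothetical hourly effective rainfall depths
runoff = convolve_runoff(effective_rainfall, unit_hydrograph)
print([round(q, 3) for q in runoff])
```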

  • SSMx  IDF Uncertainty - Bootstrap sampling

    Hourly rainfall data of two rainfall stations in northern Taiwan. (Hourly_Rainfall_Data.zip)

    1. Extract annual maximum rainfalls of various durations (1, 2, 3, 6, 12, 24, 48 hours). These are known as the annual maximum series (AMS).
    2. Conduct goodness-of-fit tests to choose the best probability distribution for rainfall frequency analysis.
    3. Determine the distribution parameters by using the method of moments and the method of L-moments.
    4. For a specific duration, calculate the design rainfall depths for return periods of 5, 10, 25, 50, 100, and 200 years.
    5. Plot the Depth-Duration-Frequency (DDF) curve and the Intensity-Duration-Frequency (IDF) curve.
    6. Evaluate the results.
    7. Investigate uncertainty of the IDF curve by bootstrapping from the annual maximum series.
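
The workflow for a single duration can be sketched as below (Python for illustration; the course uses R). The Gumbel distribution fitted by the method of moments is an assumption made to keep the example short; in the assignment the distribution is chosen by goodness-of-fit testing. The annual maximum series is synthetic, and all function names are hypothetical.

```python
import math
import random
from statistics import fmean, stdev

def annual_max_depth(hourly, duration):
    """Largest rainfall total over any `duration` consecutive hours within one year."""
    return max(sum(hourly[i:i + duration]) for i in range(len(hourly) - duration + 1))

def gumbel_frequency_factor(return_period):
    """Frequency factor K_T of the Gumbel distribution (method of moments)."""
    t = return_period
    return -(math.sqrt(6.0) / math.pi) * (0.5772 + math.log(math.log(t / (t - 1.0))))

def design_depth(ams, return_period):
    """Design depth x_T = mean + K_T * standard deviation."""
    return fmean(ams) + gumbel_frequency_factor(return_period) * stdev(ams)

def bootstrap_interval(ams, return_period, n_boot, rng, level=0.90):
    """Percentile interval of x_T from resampling the AMS with replacement."""
    estimates = sorted(
        design_depth(rng.choices(ams, k=len(ams)), return_period) for _ in range(n_boot)
    )
    lower = estimates[int((1.0 - level) / 2.0 * n_boot)]
    upper = estimates[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper

rng = random.Random(9)
# Synthetic 24-hour AMS for 30 years (hypothetical Gumbel parameters, in mm).
ams_24h = [150.0 - 40.0 * math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12))) for _ in range(30)]

for t in (5, 10, 25, 50, 100, 200):
    depth = design_depth(ams_24h, t)
    lower, upper = bootstrap_interval(ams_24h, t, 1000, rng)
    # Dividing a depth by the duration (24 h) gives the corresponding intensity.
    print(f"T = {t:3d} yr: depth = {depth:6.1f} mm, 90% interval = ({lower:.1f}, {upper:.1f})")
```

Repeating the calculation for each duration gives the points of the DDF curve, and dividing each depth by its duration gives the IDF curve with its bootstrap uncertainty band.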
