Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 2353
Missing cells (%): 17.0%
Duplicate rows: 1503
Duplicate rows (%): 10.8%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
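
The series length can be cross-checked from the starting point, ending point, and period. The following is a minimal sketch, assuming a strictly regular daily index with both endpoints included; the helper name `expected_length` is illustrative and not part of the report.

```python
from datetime import date, timedelta


def expected_length(start: date, end: date,
                    period: timedelta = timedelta(days=1)) -> int:
    """Count the timestamps from start to end (inclusive) at a fixed period."""
    # Dividing two timedeltas with // yields an integer number of periods.
    return (end - start) // period + 1


# Endpoints taken from the timeseries statistics above.
n = expected_length(date(1983, 1, 1), date(2020, 12, 31))
print(n)
```

Under these assumptions the result equals the reported time series length of 13880, so the index has no missing dates; the missing cells are missing values, not missing rows.
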

Alerts

Dataset has 1503 (10.8%) duplicate rows (Duplicates)
Flow has 2353 (17.0%) missing values (Missing)
Flow has 145 (1.0%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-12 19:36:01.130077
Analysis finished: 2024-05-12 19:36:03.452885
Duration: 2.32 seconds
MissingQ_Station_NA_25017010_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

Alerts: MISSING, ZEROS

Distinct: 9088
Distinct (%): 78.8%
Missing: 2353
Missing (%): 17.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3485025
Minimum: -1288.3
Maximum: 950.8
Zeros: 145
Zeros (%): 1.0%
Memory size: 216.9 KiB

Quantile statistics

Minimum: -1288.3
5-th percentile: -228.2
Q1: -26.6
Median: 1.9
Q3: 46.45
95-th percentile: 197.57
Maximum: 950.8
Range: 2239.1
Interquartile range (IQR): 73.05
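
The range and interquartile range follow directly from the other quantile statistics. As a small arithmetic check (a sketch only; the function name `spread` is illustrative), the two values can be recomputed from the reported minimum, Q1, Q3, and maximum:

```python
def spread(minimum: float, q1: float, q3: float, maximum: float) -> dict:
    """Return the range (max - min) and the interquartile range (Q3 - Q1)."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


# Inputs copied from the quantile statistics above.
s = spread(minimum=-1288.3, q1=-26.6, q3=46.45, maximum=950.8)
print(round(s["range"], 2), round(s["iqr"], 2))
```

Rounded to two decimals, both results agree with the reported range of 2239.1 and IQR of 73.05.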

Descriptive statistics

Standard deviation: 135.91002
Coefficient of variation (CV): 100.78589
Kurtosis: 8.3526547
Mean: 1.3485025
Median Absolute Deviation (MAD): 38.4
Skewness: -0.88956971
Sum: 15544.188
Variance: 18471.534
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 1.00143545 × 10^-28
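
Several of the descriptive statistics are tied together by simple identities: the variance is the squared standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the mean is the sum divided by the number of valid observations. The sketch below checks these relations against the reported figures, assuming the mean is taken over non-missing values only (observations minus missing cells).

```python
import math

# Values copied from the report.
std = 135.91002
mean = 1.3485025
total = 15544.188
n_valid = 13880 - 2353  # observations minus missing cells (assumption)

variance = std ** 2            # variance = std^2
cv = std / mean                # coefficient of variation = std / mean
mean_from_sum = total / n_valid

print(f"variance={variance:.3f} cv={cv:.5f} mean={mean_from_sum:.7f}")
print(math.isclose(cv, 100.78589, rel_tol=1e-6))
```

Because the mean is close to zero relative to the spread, the coefficient of variation is very large, which makes it a poor summary of relative variability for this series.
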
Histogram with fixed size bins (bins=50)

Gap statistics

Number of gaps: 76
Min: 4 days
Max: 2 years and 6 days
Mean: 4 weeks, 1 day and 3 hours
Std: 14 weeks, 3 hours and 12 minutes
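
Gap statistics of this kind can be derived by looking at the spacing between consecutive valid observations. The following is a minimal sketch, assuming a gap is any stretch between two valid observations that is longer than one period. The profiling tool's actual threshold is not stated in the report (the reported minimum of 4 days suggests it may be larger), and the helper `find_gaps` and its toy data are hypothetical.

```python
from datetime import date, timedelta


def find_gaps(observations, period=timedelta(days=1)):
    """Return (first_missing, last_missing, length) for each interior gap.

    observations: list of (date, value) pairs sorted by date, where a
    missing value is represented by None.
    """
    valid = [d for d, v in observations if v is not None]
    gaps = []
    for prev, cur in zip(valid, valid[1:]):
        if cur - prev > period:  # more than one period between valid points
            gaps.append((prev + period, cur - period, cur - prev - period))
    return gaps


# Toy daily series: 1-10 January 2020 with values missing on 3-6 January.
toy = [(date(2020, 1, d), None if 3 <= d <= 6 else 1.0) for d in range(1, 11)]
gaps = find_gaps(toy)
print(gaps)
```

Note that this sketch only reports gaps between valid observations; a run of missing values at the very start of the series, as in the first sample rows of this dataset, is not counted.
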
Value                 Count   Frequency (%)
0                     145     1.0%
-0.5                  13      0.1%
-0.01                 12      0.1%
1.5                   12      0.1%
0.01                  10      0.1%
0.7                   10      0.1%
0.5                   10      0.1%
0.4                   10      0.1%
1                     10      0.1%
0.8                   9       0.1%
Other values (9078)   11286   81.3%
(Missing)             2353    17.0%

Value     Count   Frequency (%)
-1288.3   1       < 0.1%
-1122.7   1       < 0.1%
-1054.6   1       < 0.1%
-1004.7   1       < 0.1%
-983.9    1       < 0.1%
-934.9    1       < 0.1%
-908.7    1       < 0.1%
-887.1    1       < 0.1%
-873.3    1       < 0.1%
-861.4    1       < 0.1%

Value   Count   Frequency (%)
950.8   1       < 0.1%
767.5   1       < 0.1%
761.8   1       < 0.1%
744.3   1       < 0.1%
737.4   1       < 0.1%
719.8   1       < 0.1%
718.1   1       < 0.1%
714.6   1       < 0.1%
682.2   1       < 0.1%
667.2   1       < 0.1%

ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Date         Flow
1983-01-01   NaN
1983-01-02   NaN
1983-01-03   NaN
1983-01-04   NaN
1983-01-05   NaN
1983-01-06   NaN
1983-01-07   NaN
1983-01-08   NaN
1983-01-09   NaN
1983-01-10   NaN

Date         Flow
2020-12-22   4.400
2020-12-23   -5.600
2020-12-24   1.788
2020-12-25   4.562
2020-12-26   0.062
2020-12-27   8.738
2020-12-28   7.025
2020-12-29   -25.075
2020-12-30   0.488
2020-12-31   1.899

Duplicate rows

Most frequently occurring

       Flow    # duplicates
1502   NaN     2353
565    0.00    145
519    -0.50   13
562    -0.01   12
679    1.50    12
571    0.01    10
604    0.40    10
617    0.50    10
633    0.70    10
652    1.00    10
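
A table like this can be built by counting how often each value occurs and keeping those that occur more than once. The sketch below uses toy data and a hypothetical helper `repeated_values`; it assumes missing values are represented by `None`, since float NaN values do not compare equal to each other and would need normalizing first. The row labels above run up to 1502, which is consistent with the 1503 reported duplicates counting distinct repeated values, but that reading is an inference, not something the report states.

```python
from collections import Counter


def repeated_values(values):
    """Return (value, count) pairs for values seen more than once,
    most frequent first."""
    counts = Counter(values)
    repeated = {v: c for v, c in counts.items() if c > 1}
    return sorted(repeated.items(), key=lambda item: item[1], reverse=True)


# Toy single-column dataset; None stands in for a missing Flow value.
flow = [None, None, 0.0, 0.0, 0.0, 1.5, -0.5, -0.5, 2.0]
top = repeated_values(flow)
print(top)
print(len(top), "distinct repeated values")
```

With a single variable, every repeated value is a duplicated row, so a long daily series with limited measurement precision will naturally produce many such duplicates.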