Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 1365
Missing cells (%): 9.8%
Duplicate rows: 795
Duplicate rows (%): 5.7%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
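The percentages and the average record size can be recomputed from the raw counts above. The sketch below is a sanity check only. It assumes that 1 KiB is 1024 bytes and that percentages are taken over all 13880 observations; the report does not state either convention.

```python
# Raw counts copied from the dataset statistics above.
n_obs = 13880
missing_cells = 1365
duplicate_rows = 795
total_kib = 216.9  # total size in memory, in KiB

# With a single variable, the number of cells equals the number of observations.
missing_pct = 100 * missing_cells / n_obs
duplicate_pct = 100 * duplicate_rows / n_obs

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_obs

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three values agree with the figures reported above.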

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
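The series length can be cross-checked against the starting point, ending point, and period above. A minimal sketch, assuming a strictly daily index with both endpoints included:

```python
from datetime import date

start = date(1983, 1, 1)
end = date(2020, 12, 31)

# Both endpoints are included, hence the +1.
expected_length = (end - start).days + 1
print(expected_length)
```

The result equals the reported length of 13880, which suggests that every calendar day has a row and that the missing values are NaN entries rather than absent dates.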

Alerts

Dataset has 795 (5.7%) duplicate rows (Duplicates)
Flow has 1365 (9.8%) missing values (Missing)

Reproduction

Analysis started: 2024-05-12 19:35:09.650889
Analysis finished: 2024-05-12 19:35:11.427583
Duration: 1.78 seconds
MissingQ_Station_NA_25027640_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING 

Distinct: 2982
Distinct (%): 23.8%
Missing: 1365
Missing (%): 9.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.024003196
Minimum: -4646
Maximum: 4308
Zeros: 117
Zeros (%): 0.8%
Memory size: 216.9 KiB

Quantile statistics

Minimum: -4646
5-th percentile: -237
Q1: -53
Median: 1
Q3: 54
95-th percentile: 230.3
Maximum: 4308
Range: 8954
Interquartile range (IQR): 107
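The range and the interquartile range are simple differences of the quantiles listed above. A short check, using only values from the report:

```python
# Quantiles copied from the quantile statistics above.
minimum, q1, q3, maximum = -4646, -53, 54, 4308

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(value_range, iqr)
```

The range is about 84 times the IQR, which is consistent with a few extreme values on both sides of a narrow central mass.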

Descriptive statistics

Standard deviation: 224.48974
Coefficient of variation (CV): 9352.4936
Kurtosis: 67.491502
Mean: 0.024003196
Median Absolute Deviation (MAD): 53
Skewness: -0.29156623
Sum: 300.4
Variance: 50395.642
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 0
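Several of the descriptive statistics are tied together by simple identities. The sketch below recomputes the mean, variance, and coefficient of variation from the reported sum, standard deviation, and counts. It assumes the mean is taken over non-missing values only; small differences in the last digits come from the rounding of the printed inputs.

```python
# Values copied from the report.
n_obs = 13880
n_missing = 1365
total = 300.4      # Sum
std = 224.48974    # Standard deviation

# Assumption: statistics are computed over non-missing observations only.
n_valid = n_obs - n_missing

mean = total / n_valid
variance = std ** 2
cv = std / mean  # coefficient of variation = std / mean

print(f"mean={mean:.9f} variance={variance:.3f} cv={cv:.4f}")
```

The very large CV is a consequence of dividing by a mean that is close to zero, so it says little about the relative dispersion of this series.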
Histogram with fixed size bins (bins=50)

Gap statistics

number of gaps: 19
min: 5 days
max: 1 year and 6 days
mean: 10 weeks, 2 days and 16 hours
std: 16 weeks, 2 days and 23 hours
Value                  Count    Frequency (%)
0                      117      0.8%
-2                     86       0.6%
14                     81       0.6%
3                      81       0.6%
4                      78       0.6%
-11                    76       0.5%
-3                     76       0.5%
-6                     73       0.5%
16                     72       0.5%
8                      72       0.5%
Other values (2972)    11703    84.3%
(Missing)              1365     9.8%

Value     Count    Frequency (%)
-4646     1        < 0.1%
-4030     1        < 0.1%
-2805     1        < 0.1%
-2613     1        < 0.1%
-2467     1        < 0.1%
-2459     1        < 0.1%
-2303     1        < 0.1%
-2178     1        < 0.1%
-2173     1        < 0.1%
-2121     1        < 0.1%

Value     Count    Frequency (%)
4308      1        < 0.1%
2947      1        < 0.1%
2833      1        < 0.1%
2787      1        < 0.1%
2600      1        < 0.1%
2460      1        < 0.1%
2450      1        < 0.1%
2391.9    1        < 0.1%
2301      1        < 0.1%
2276      1        < 0.1%

ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Date          Flow
1983-01-01    NaN
1983-01-02    NaN
1983-01-03    NaN
1983-01-04    205.0
1983-01-05    -227.0
1983-01-06    110.0
1983-01-07    77.0
1983-01-08    -120.0
1983-01-09    174.0
1983-01-10    -264.0

Date          Flow
2020-12-22    110.4
2020-12-23    -72.4
2020-12-24    85.3
2020-12-25    -101.2
2020-12-26    93.0
2020-12-27    -67.3
2020-12-28    -7.2
2020-12-29    -87.4
2020-12-30    192.0
2020-12-31    -91.2
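The leading NaN rows in the first sample illustrate what a run of missing values looks like in this series. The sketch below finds such runs in the first ten rows shown above. The helper `missing_runs` is hypothetical and written for illustration only; the report's own definition of a gap in its gap statistics is not stated here and may differ (for example, by applying a minimum length).

```python
from datetime import date
from math import isnan

# First ten rows of the series, copied from the sample above.
head = [
    (date(1983, 1, 1), float("nan")),
    (date(1983, 1, 2), float("nan")),
    (date(1983, 1, 3), float("nan")),
    (date(1983, 1, 4), 205.0),
    (date(1983, 1, 5), -227.0),
    (date(1983, 1, 6), 110.0),
    (date(1983, 1, 7), 77.0),
    (date(1983, 1, 8), -120.0),
    (date(1983, 1, 9), 174.0),
    (date(1983, 1, 10), -264.0),
]


def missing_runs(rows):
    """Return (first_missing_day, last_missing_day, length_in_days) per NaN run.

    Hypothetical helper: assumes rows are sorted by date with one row per day.
    """
    runs = []
    run_start = None
    prev_day = None
    for day, value in rows:
        if isnan(value):
            if run_start is None:
                run_start = day  # a new run of missing values begins
        elif run_start is not None:
            # The run ended on the previous day.
            runs.append((run_start, prev_day, (prev_day - run_start).days + 1))
            run_start = None
        prev_day = day
    if run_start is not None:  # the series ends inside a run
        runs.append((run_start, prev_day, (prev_day - run_start).days + 1))
    return runs


runs = missing_runs(head)
print(runs)
```

On these ten rows the helper reports a single run covering 1983-01-01 through 1983-01-03. Applied to the full series, the same scan would list every run of missing days, which could then be compared with the 1365 missing values reported above.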

Duplicate rows

Most frequently occurring

Index    Flow     # duplicates
794      NaN      1365
404      0.0      117
399      -2.0     86
412      3.0      81
433      14.0     81
414      4.0      78
378      -11.0    76
397      -3.0     76
390      -6.0     73
408      1.0      72