Overview

Dataset statistics

Number of variables: 1
Number of observations: 13880
Missing cells: 1236
Missing cells (%): 8.9%
Duplicate rows: 902
Duplicate rows (%): 6.5%
Total size in memory: 216.9 KiB
Average record size in memory: 16.0 B
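The percentages and the average record size follow from the raw counts. A minimal sketch of that arithmetic, using only the figures reported above (the variable names are illustrative, and it assumes that with a single variable the number of cells equals the number of observations):

```python
# Figures copied from the "Dataset statistics" table.
n_obs = 13880            # number of observations (one variable, so also the cell count)
missing_cells = 1236
duplicate_rows = 902
total_kib = 216.9        # total size in memory, in KiB

# Percentages are taken relative to the number of observations.
missing_pct = round(100 * missing_cells / n_obs, 1)
duplicate_pct = round(100 * duplicate_rows / n_obs, 1)

# Average record size in bytes: total bytes divided by row count.
avg_record_bytes = round(total_kib * 1024 / n_obs, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```

The three results round to the reported 8.9%, 6.5%, and 16.0 B.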

Variable types

TimeSeries: 1

Timeseries statistics

Number of series: 1
Time series length: 13880
Starting point: 1983-01-01 00:00:00
Ending point: 2020-12-31 00:00:00
Period: 1 day
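The series length is consistent with one row per calendar day between the starting and ending points, which suggests the missing values are stored as empty cells rather than as absent dates. A small check using only the standard library (the variable names are illustrative):

```python
from datetime import date

# Start and end of the series, as reported above.
start = date(1983, 1, 1)
end = date(2020, 12, 31)

# Number of calendar days, counting both endpoints.
n_days = (end - start).days + 1

# Observations that actually carry a value (length minus missing cells).
n_missing = 1236
n_valid = n_days - n_missing

print(n_days, n_valid)
```

The day count matches the reported length of 13880, leaving 12644 observed values.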

Alerts

Dataset has 902 (6.5%) duplicate rows (Duplicates)
Flow has 1236 (8.9%) missing values (Missing)

Reproduction

Analysis started: 2024-05-12 19:34:46.215446
Analysis finished: 2024-05-12 19:34:49.024520
Duration: 2.81 seconds
MissingQ_Station_NA_25027050_ok_Missing.csv
Download configuration: config.json

Variables

Flow
Numeric time series

MISSING 

Distinct: 3522
Distinct (%): 27.9%
Missing: 1236
Missing (%): 8.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.05962591
Minimum: -1533
Maximum: 1416
Zeros: 89
Zeros (%): 0.6%
Memory size: 216.9 KiB

Quantile statistics

Minimum: -1533
5-th percentile: -269.34
Q1: -66.5475
Median: 5.84
Q3: 73
95-th percentile: 248
Maximum: 1416
Range: 2949
Interquartile range (IQR): 139.5475
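The range and the interquartile range are derived from the other quantiles in this table. A minimal sketch of the two subtractions, using the reported values (variable names are illustrative):

```python
# Quantiles copied from the "Quantile statistics" table.
minimum, maximum = -1533, 1416
q1, q3 = -66.5475, 73

# Range: distance between the extremes.
value_range = maximum - minimum

# Interquartile range: spread of the middle 50% of the values.
iqr = round(q3 - q1, 4)

print(value_range, iqr)
```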

Descriptive statistics

Standard deviation: 169.34023
Coefficient of variation (CV): 2840.0445
Kurtosis: 8.4972995
Mean: 0.05962591
Median Absolute Deviation (MAD): 69.84
Skewness: -0.22764574
Sum: 753.91
Variance: 28676.115
Monotonicity: Not monotonic
Augmented Dickey-Fuller test p-value: 0
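Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the mean is the sum divided by the number of non-missing values. A sketch that recomputes them from the rounded values shown in the report (so small rounding differences are expected; variable names are illustrative):

```python
# Figures copied from the report.
std = 169.34023
mean = 0.05962591
total = 753.91
n_valid = 13880 - 1236   # observations minus missing values

# Coefficient of variation as a plain ratio (not a percentage).
cv = std / mean

# Variance is the square of the standard deviation.
variance = std ** 2

# Mean recomputed from the sum over non-missing values.
mean_from_sum = total / n_valid

print(round(cv, 2), round(variance, 2), round(mean_from_sum, 6))
```

Because the mean is very close to zero relative to the spread, the CV is very large here and says little about relative variability.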
Histogram with fixed size bins (bins=50)

Gap statistics

Number of gaps: 27
Min: 4 days
Max: 1 year and 6 days
Mean: 6 weeks, 4 days and 13 hours
Std: 13 weeks, 5 days and 1 hour
Value                  Count   Frequency (%)
0                      89      0.6%
10                     58      0.4%
8                      56      0.4%
-7                     52      0.4%
4                      52      0.4%
2                      52      0.4%
39                     50      0.4%
-4                     50      0.4%
1                      50      0.4%
9                      50      0.4%
Other values (3512)    12085   87.1%
(Missing)              1236    8.9%
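The "Other values" row can be reconstructed from the rest of the table: its distinct count is the total number of distinct values minus the ten listed, and its row count is whatever remains after removing the listed values and the missing cells. A sketch of that bookkeeping with the reported figures (variable names are illustrative):

```python
# Counts of the ten most common values, in table order.
top_counts = [89, 58, 56, 52, 52, 52, 50, 50, 50, 50]

n_obs = 13880        # number of observations
n_missing = 1236     # missing cells
n_distinct = 3522    # distinct non-missing values

# Distinct values not shown individually.
other_distinct = n_distinct - len(top_counts)

# Rows covered by those remaining values.
other_count = n_obs - n_missing - sum(top_counts)
other_pct = round(100 * other_count / n_obs, 1)

print(other_distinct, other_count, other_pct)
```

The results agree with the table: 3512 other values covering 12085 rows, or 87.1% of all observations.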
Value    Count   Frequency (%)
-1533    1       < 0.1%
-1452    1       < 0.1%
-1435    1       < 0.1%
-1425    1       < 0.1%
-1201    1       < 0.1%
-1194    1       < 0.1%
-1114    1       < 0.1%
-1097    2       < 0.1%
-1082    1       < 0.1%
-1072    1       < 0.1%
Value    Count   Frequency (%)
1416     1       < 0.1%
1391     1       < 0.1%
1352     1       < 0.1%
1319     1       < 0.1%
1276     1       < 0.1%
1214     1       < 0.1%
1158     1       < 0.1%
1145     1       < 0.1%
1073     1       < 0.1%
1007     1       < 0.1%
ACF and PACF

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completeness.

Sample

Date          Flow
1983-01-01    NaN
1983-01-02    NaN
1983-01-03    -44.0
1983-01-04    47.0
1983-01-05    -128.0
1983-01-06    31.0
1983-01-07    4.0
1983-01-08    24.0
1983-01-09    50.0
1983-01-10    -37.0

Date          Flow
2020-12-22    25.20
2020-12-23    -3.41
2020-12-24    45.68
2020-12-25    32.84
2020-12-26    -63.76
2020-12-27    350.10
2020-12-28    NaN
2020-12-29    NaN
2020-12-30    NaN
2020-12-31    NaN

Duplicate rows

Most frequently occurring

       Flow    # duplicates
901    NaN     1236
445    0.0     89
470    10.0    58
467    8.0     56
424    -7.0    52
451    2.0     52
456    4.0     52
432    -4.0    50
449    1.0     50
468    9.0     50