Dataset statistics
Number of variables | 1 |
---|---|
Number of observations | 59079 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 31785 |
Duplicate rows (%) | 53.8% |
Total size in memory | 5.2 MiB |
Average record size in memory | 91.7 B |
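The duplicate figures above are consistent with counting every repeat of a row after its first occurrence, that is, observations minus distinct rows (59079 - 27294 = 31785, or 53.8%). The following is a minimal sketch of that arithmetic in plain Python, assuming the rows are available as a list of strings. `duplicate_stats` is a hypothetical helper for illustration and is not part of pandas-profiling.

```python
def duplicate_stats(rows):
    """Return (n_rows, n_distinct, n_duplicates, duplicate_pct) for a list of rows."""
    n_rows = len(rows)
    n_distinct = len(set(rows))
    # Assumption: every repeat after the first occurrence counts as one duplicate.
    n_duplicates = n_rows - n_distinct
    duplicate_pct = 100.0 * n_duplicates / n_rows if n_rows else 0.0
    return n_rows, n_distinct, n_duplicates, duplicate_pct


# Tiny illustrative sample, not the profiled dataset.
sample = ["<td>0</td>", "</tr>", "<td>0</td>", "<td>US</td>", "</tr>", "<td>0</td>"]
print(duplicate_stats(sample))
```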
Variable types
CAT | 1 |
---|
Reproduction
Analysis started | 2020-06-04 22:28:50.050734 |
---|---|
Analysis finished | 2020-06-04 22:28:51.972652 |
Duration | 1.92 seconds |
Version | pandas-profiling v2.8.0 |
Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Download configuration | config.yaml |
Distinct count | 27294 |
---|---|
Unique (%) | 46.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 461.7 KiB |
Value | Count | Frequency (%) | |
<td>0</td> | 4631 | 7.8% | |
<td>2020-06-03 02:33:13</td> | 3641 | 6.2% | |
</tr> | 3641 | 6.2% | |
<td>US</td> | 3036 | 5.1% | |
<td></td> | 1668 | 2.8% | |
<td>0.0</td> | 1329 | 2.2% | |
<td>1</td> | 734 | 1.2% | |
<td>2</td> | 427 | 0.7% | |
<td>3</td> | 343 | 0.6% | |
<td>4</td> | 257 | 0.4% | |
<td>Texas</td> | 235 | 0.4% | |
<td>5</td> | 204 | 0.3% | |
<td>7</td> | 185 | 0.3% | |
<td>6</td> | 182 | 0.3% | |
<td>Georgia</td> | 163 | 0.3% | |
<td>8</td> | 153 | 0.3% | |
<td>9</td> | 144 | 0.2% | |
<td>Virginia</td> | 133 | 0.2% | |
<td>12</td> | 126 | 0.2% | |
<td>Kentucky</td> | 120 | 0.2% | |
<td>11</td> | 118 | 0.2% | |
<td>10</td> | 116 | 0.2% | |
<td>13</td> | 103 | 0.2% | |
<td>Missouri</td> | 103 | 0.2% | |
<td>Illinois</td> | 103 | 0.2% | |
Other values (27269) | 37184 | 62.9% |
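A frequency table like the one above, with the 25 most common values and the remainder folded into a single "Other values" row, can be sketched with `collections.Counter`. This is an illustration under the assumption that each row is a plain string. `frequency_table` is a hypothetical name and does not reflect how pandas-profiling builds the table internally.

```python
from collections import Counter


def frequency_table(rows, top_n=25):
    """Return (value, count, percent) tuples for the top_n values plus an 'Other values' row."""
    counts = Counter(rows)
    total = len(rows)
    top = counts.most_common(top_n)
    table = [(value, count, 100.0 * count / total) for value, count in top]
    # Everything outside the top_n is collapsed into one summary row.
    n_other_values = len(counts) - len(top)
    if n_other_values > 0:
        other_count = total - sum(count for _, count in top)
        table.append(
            (f"Other values ({n_other_values})", other_count, 100.0 * other_count / total)
        )
    return table


# Tiny illustrative sample, not the profiled dataset.
sample = ["a"] * 3 + ["b"] * 2 + ["c", "d"]
for value, count, pct in frequency_table(sample, top_n=2):
    print(f"{value} | {count} | {pct:.1f}%")
```

With the figures in the table, the same bookkeeping gives 27294 - 25 = 27269 remaining distinct values.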
Length
Max length | 765 |
---|---|
Median length | 29 |
Mean length | 34.66646355 |
Min length | 1 |
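The length statistics are per-row string lengths. As a rough cross-check, the mean length times the number of observations (34.66646355 × 59079 ≈ 2048060) matches the total character count in the script and block tables further down. A minimal sketch using only the standard library follows. `length_stats` is a hypothetical helper that assumes a non-empty list of strings.

```python
import statistics


def length_stats(rows):
    """Return max, median, mean and min string length for a non-empty list of rows."""
    lengths = [len(row) for row in rows]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }


# Tiny illustrative sample, not the profiled dataset.
sample = ["<td>0</td>", "</tr>", "<td>US</td>"]
print(length_stats(sample))
```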
Most occurring characters
Value | Count | Frequency (%) | |
(space) | 815590 | 39.8% | |
t | 117954 | 5.8% | |
d | 115593 | 5.6% | |
< | 110410 | 5.4% | |
> | 110406 | 5.4% | |
/ | 55609 | 2.7% | |
0 | 48383 | 2.4% | |
3 | 42375 | 2.1% | |
2 | 39159 | 1.9% | |
- | 38905 | 1.9% | |
" | 38518 | 1.9% | |
1 | 36530 | 1.8% | |
l | 29696 | 1.4% | |
e | 28909 | 1.4% | |
i | 28676 | 1.4% | |
n | 28344 | 1.4% | |
s | 27777 | 1.4% | |
6 | 25157 | 1.2% | |
a | 24029 | 1.2% | |
4 | 23839 | 1.2% | |
5 | 23050 | 1.1% | |
7 | 22292 | 1.1% | |
9 | 21983 | 1.1% | |
8 | 21784 | 1.1% | |
r | 19398 | 0.9% | |
Other values (64) | 153694 | 7.5% |
Most occurring categories
Value | Count | Frequency (%) | |
Space Separator | 815590 | 39.8% | |
Lowercase Letter | 503963 | 24.6% | |
Decimal Number | 304552 | 14.9% | |
Math Symbol | 240134 | 11.7% | |
Other Punctuation | 117294 | 5.7% | |
Dash Punctuation | 38905 | 1.9% | |
Uppercase Letter | 27407 | 1.3% | |
Connector Punctuation | 188 | < 0.1% | |
Open Punctuation | 9 | < 0.1% | |
Close Punctuation | 9 | < 0.1% | |
Other Symbol | 6 | < 0.1% | |
Modifier Symbol | 2 | < 0.1% | |
Final Punctuation | 1 | < 0.1% |
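The category names above correspond to Unicode general categories, for example `Zs` (Space Separator), `Ll` (Lowercase Letter), `Nd` (Decimal Number), `Sm` (Math Symbol) and `Po` (Other Punctuation). A per-character tally of this kind can be sketched with Python's `unicodedata` module, which returns the two-letter category code. `category_counts` is a hypothetical helper, and whether pandas-profiling computes its table this way is an assumption.

```python
import unicodedata
from collections import Counter


def category_counts(rows):
    """Count characters per Unicode general category (two-letter codes) across all rows."""
    counts = Counter()
    for row in rows:
        counts.update(unicodedata.category(ch) for ch in row)
    return counts


# Tiny illustrative sample, not the profiled dataset.
sample = ["<td>0</td>", "<td>US</td>"]
for code, count in category_counts(sample).most_common():
    print(code, count)
```

Because `<` and `>` fall under `Sm`, markup-heavy rows such as these explain the large Math Symbol share in the table.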
Most frequent Math Symbol characters
Value | Count | Frequency (%) | |
< | 110410 | 46.0% | |
> | 110406 | 46.0% | |
= | 19297 | 8.0% | |
+ | 21 | < 0.1% |
Most frequent Lowercase Letter characters
Value | Count | Frequency (%) | |
t | 117954 | 23.4% | |
d | 115593 | 22.9% | |
l | 29696 | 5.9% | |
e | 28909 | 5.7% | |
i | 28676 | 5.7% | |
n | 28344 | 5.6% | |
s | 27777 | 5.5% | |
a | 24029 | 4.8% | |
r | 19398 | 3.8% | |
b | 15684 | 3.1% | |
u | 13005 | 2.6% | |
m | 12614 | 2.5% | |
c | 9746 | 1.9% | |
o | 9744 | 1.9% | |
j | 7616 | 1.5% | |
f | 4474 | 0.9% | |
h | 2244 | 0.4% | |
g | 1740 | 0.3% | |
p | 1608 | 0.3% | |
k | 1226 | 0.2% | |
y | 1054 | 0.2% | |
v | 995 | 0.2% | |
w | 762 | 0.2% | |
x | 756 | 0.2% | |
z | 255 | 0.1% |
Most frequent Space Separator characters
Value | Count | Frequency (%) | |
(space) | 815590 | 100.0% |
Most frequent Other Punctuation characters
Value | Count | Frequency (%) | |
/ | 55609 | 47.4% | |
" | 38518 | 32.8% | |
. | 15701 | 13.4% | |
: | 7351 | 6.3% | |
; | 30 | < 0.1% | |
& | 25 | < 0.1% | |
% | 25 | < 0.1% | |
# | 13 | < 0.1% | |
? | 6 | < 0.1% | |
! | 5 | < 0.1% | |
* | 4 | < 0.1% | |
· | 2 | < 0.1% | |
' | 2 | < 0.1% | |
… | 2 | < 0.1% | |
@ | 1 | < 0.1% |
Most frequent Dash Punctuation characters
Value | Count | Frequency (%) | |
- | 38905 | 100.0% |
Most frequent Decimal Number characters
Value | Count | Frequency (%) | |
0 | 48383 | 15.9% | |
3 | 42375 | 13.9% | |
2 | 39159 | 12.9% | |
1 | 36530 | 12.0% | |
6 | 25157 | 8.3% | |
4 | 23839 | 7.8% | |
5 | 23050 | 7.6% | |
7 | 22292 | 7.3% | |
9 | 21983 | 7.2% | |
8 | 21784 | 7.2% |
Most frequent Uppercase Letter characters
Value | Count | Frequency (%) | |
L | 7700 | 28.1% | |
C | 4692 | 17.1% | |
S | 3793 | 13.8% | |
U | 3212 | 11.7% | |
M | 980 | 3.6% | |
I | 617 | 2.3% | |
N | 526 | 1.9% | |
T | 511 | 1.9% | |
G | 494 | 1.8% | |
O | 463 | 1.7% | |
W | 433 | 1.6% | |
A | 428 | 1.6% | |
D | 427 | 1.6% | |
V | 426 | 1.6% | |
B | 424 | 1.5% | |
K | 376 | 1.4% | |
H | 372 | 1.4% | |
P | 365 | 1.3% | |
R | 289 | 1.1% | |
F | 260 | 0.9% | |
J | 217 | 0.8% | |
E | 180 | 0.7% | |
Y | 116 | 0.4% | |
Z | 46 | 0.2% | |
Q | 32 | 0.1% |
Most frequent Connector Punctuation characters
Value | Count | Frequency (%) | |
_ | 188 | 100.0% |
Most frequent Modifier Symbol characters
Value | Count | Frequency (%) | |
` | 2 | 100.0% |
Most frequent Other Symbol characters
Value | Count | Frequency (%) | |
↵ | 6 | 100.0% |
Most frequent Open Punctuation characters
Value | Count | Frequency (%) | |
( | 9 | 100.0% |
Most frequent Close Punctuation characters
Value | Count | Frequency (%) | |
) | 9 | 100.0% |
Most frequent Final Punctuation characters
Value | Count | Frequency (%) | |
’ | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) | |
Common | 1516690 | 74.1% | |
Latin | 531370 | 25.9% |
Most frequent Common characters
Value | Count | Frequency (%) | |
(space) | 815590 | 53.8% | |
< | 110410 | 7.3% | |
> | 110406 | 7.3% | |
/ | 55609 | 3.7% | |
0 | 48383 | 3.2% | |
3 | 42375 | 2.8% | |
2 | 39159 | 2.6% | |
- | 38905 | 2.6% | |
" | 38518 | 2.5% | |
1 | 36530 | 2.4% | |
6 | 25157 | 1.7% | |
4 | 23839 | 1.6% | |
5 | 23050 | 1.5% | |
7 | 22292 | 1.5% | |
9 | 21983 | 1.4% | |
8 | 21784 | 1.4% | |
= | 19297 | 1.3% | |
. | 15701 | 1.0% | |
: | 7351 | 0.5% | |
_ | 188 | < 0.1% | |
; | 30 | < 0.1% | |
& | 25 | < 0.1% | |
% | 25 | < 0.1% | |
+ | 21 | < 0.1% | |
# | 13 | < 0.1% | |
Other values (12) | 49 | < 0.1% |
Most frequent Latin characters
Value | Count | Frequency (%) | |
t | 117954 | 22.2% | |
d | 115593 | 21.8% | |
l | 29696 | 5.6% | |
e | 28909 | 5.4% | |
i | 28676 | 5.4% | |
n | 28344 | 5.3% | |
s | 27777 | 5.2% | |
a | 24029 | 4.5% | |
r | 19398 | 3.7% | |
b | 15684 | 3.0% | |
u | 13005 | 2.4% | |
m | 12614 | 2.4% | |
c | 9746 | 1.8% | |
o | 9744 | 1.8% | |
L | 7700 | 1.4% | |
j | 7616 | 1.4% | |
C | 4692 | 0.9% | |
f | 4474 | 0.8% | |
S | 3793 | 0.7% | |
U | 3212 | 0.6% | |
h | 2244 | 0.4% | |
g | 1740 | 0.3% | |
p | 1608 | 0.3% | |
k | 1226 | 0.2% | |
y | 1054 | 0.2% | |
Other values (27) | 10842 | 2.0% |
Most occurring blocks
Value | Count | Frequency (%) | |
ASCII | 2048049 | > 99.9% | |
Arrows | 6 | < 0.1% | |
Punctuation | 3 | < 0.1% | |
None | 2 | < 0.1% |
Most frequent ASCII characters
Value | Count | Frequency (%) | |
(space) | 815590 | 39.8% | |
t | 117954 | 5.8% | |
d | 115593 | 5.6% | |
< | 110410 | 5.4% | |
> | 110406 | 5.4% | |
/ | 55609 | 2.7% | |
0 | 48383 | 2.4% | |
3 | 42375 | 2.1% | |
2 | 39159 | 1.9% | |
- | 38905 | 1.9% | |
" | 38518 | 1.9% | |
1 | 36530 | 1.8% | |
l | 29696 | 1.4% | |
e | 28909 | 1.4% | |
i | 28676 | 1.4% | |
n | 28344 | 1.4% | |
s | 27777 | 1.4% | |
6 | 25157 | 1.2% | |
a | 24029 | 1.2% | |
4 | 23839 | 1.2% | |
5 | 23050 | 1.1% | |
7 | 22292 | 1.1% | |
9 | 21983 | 1.1% | |
8 | 21784 | 1.1% | |
r | 19398 | 0.9% | |
Other values (60) | 153683 | 7.5% |
Most frequent None characters
Value | Count | Frequency (%) | |
· | 2 | 100.0% |
Most frequent Arrows characters
Value | Count | Frequency (%) | |
↵ | 6 | 100.0% |
Most frequent Punctuation characters
Value | Count | Frequency (%) | |
… | 2 | 66.7% | |
’ | 1 | 33.3% |
First rows
<!DOCTYPE html> | |
---|---|
0 | <html lang="en"> |
1 | <head> |
2 | <meta charset="utf-8"> |
3 | <link rel="dns-prefetch" href="https://github.githubassets.com"> |
4 | <link rel="dns-prefetch" href="https://avatars0.githubusercontent.com"> |
5 | <link rel="dns-prefetch" href="https://avatars1.githubusercontent.com"> |
6 | <link rel="dns-prefetch" href="https://avatars2.githubusercontent.com"> |
7 | <link rel="dns-prefetch" href="https://avatars3.githubusercontent.com"> |
8 | <link rel="dns-prefetch" href="https://github-cloud.s3.amazonaws.com"> |
9 | <link rel="dns-prefetch" href="https://user-images.githubusercontent.com/"> |
Last rows
<!DOCTYPE html> | |
---|---|
59069 | <div class="octocat-spinner my-6 js-details-dialog-spinner"></div> |
59070 | </details-dialog> |
59071 | </details> |
59072 | </template> |
59073 | <div class="Popover js-hovercard-content position-absolute" style="display: none; outline: none;" tabindex="0"> |
59074 | <div class="Popover-message Popover-message--bottom-left Popover-message--large Box box-shadow-large" style="width:360px;"> |
59075 | </div> |
59076 | </div> |
59077 | </body> |
59078 | </html> |
Most frequent
<!DOCTYPE html> | count | |
---|---|---|
46 | <td>0</td> | 4631 |
399 | <td>2020-06-03 02:33:13</td> | 3641 |
1688 | </tr> | 3641 |
1623 | <td>US</td> | 3036 |
1028 | <td></td> | 1668 |
25 | <td>0.0</td> | 1329 |
360 | <td>1</td> | 734 |
526 | <td>2</td> | 427 |
643 | <td>3</td> | 343 |
738 | <td>4</td> | 257 |