APS Failure (UCI, Classification, n=76000, d=170, 2 classes)

Loading The Data

In [1]:
import kxy # pip install kxy; registers the df.kxy pandas accessor used below
from kxy_datasets.uci_classifications import APSFailure # pip install kxy_datasets
In [2]:
dataset = APSFailure()
df = dataset.df # Retrieve the dataset as a pandas dataframe
y_column = dataset.y_column # The name of the column corresponding to the target
problem_type = dataset.problem_type # 'regression' or 'classification'
In [3]:
df.kxy.describe() # Visualize a summary of the data

--------------
Column: aa_000
--------------
Type:   Continuous
Max:    42,949,672
p75:    48,840
Mean:   61,159
Median: 30,813
p25:    860
Min:    0.0

--------------
Column: ab_000
--------------
Type:   Continuous
Max:    204
p75:    0.0
Mean:   0.7
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ac_000
--------------
Type:   Continuous
Max:    2,130,706,796
p75:    970
Mean:   356,439,779
Median: 154
p25:    16
Min:    0.0

--------------
Column: ad_000
--------------
Type:   Continuous
Max:    8,584,297,742
p75:    430
Mean:   150,629
Median: 128
p25:    24
Min:    0.0

--------------
Column: ae_000
--------------
Type:   Continuous
Max:    21,050
p75:    0.0
Mean:   6.7
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: af_000
--------------
Type:   Continuous
Max:    20,070
p75:    0.0
Mean:   10
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ag_000
--------------
Type:   Continuous
Max:    3,376,892
p75:    0.0
Mean:   200
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ag_001
--------------
Type:   Continuous
Max:    10,472,522
p75:    0.0
Mean:   1,204
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ag_002
--------------
Type:   Continuous
Max:    19,149,160
p75:    0.0
Mean:   9,697
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ag_003
--------------
Type:   Continuous
Max:    73,057,472
p75:    0.0
Mean:   93,649
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ag_004
--------------
Type:   Continuous
Max:    228,830,570
p75:    50,226
Mean:   448,342
Median: 3,682
p25:    310
Min:    0.0

--------------
Column: ag_005
--------------
Type:   Continuous
Max:    179,187,978
p75:    916,692
Mean:   1,122,573
Median: 176,963
p25:    14,004
Min:    0.0

--------------
Column: ag_006
--------------
Type:   Continuous
Max:    94,020,666
p75:    1,887,378
Mean:   1,666,271
Median: 937,712
p25:    10,872
Min:    0.0

--------------
Column: ag_007
--------------
Type:   Continuous
Max:    63,346,754
p75:    590,030
Mean:   500,795
Median: 118,927
p25:    0.0
Min:    0.0

--------------
Column: ag_008
--------------
Type:   Continuous
Max:    17,702,522
p75:    26,930
Mean:   35,643
Median: 1,776
p25:    0.0
Min:    0.0

--------------
Column: ag_009
--------------
Type:   Continuous
Max:    25,198,514
p75:    374
Mean:   5,255
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ah_000
--------------
Type:   Continuous
Max:    82,073,576
p75:    1,604,618
Mean:   1,832,622
Median: 1,004,667
p25:    30,170
Min:    0.0

--------------
Column: ai_000
--------------
Type:   Continuous
Max:    17,770,090
p75:    0.0
Mean:   9,597
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: aj_000
--------------
Type:   Continuous
Max:    5,629,340
p75:    0.0
Mean:   1,167
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ak_000
--------------
Type:   Continuous
Max:    10,930,586
p75:    0.0
Mean:   968
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: al_000
--------------
Type:   Continuous
Max:    37,779,302
p75:    1,226
Mean:   61,605
Median: 0.0
p25:    0.0
Min:    0.0

------------
Column: am_0
------------
Type:   Continuous
Max:    55,903,508
p75:    2,394
Mean:   97,244
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: an_000
--------------
Type:   Continuous
Max:    149,868,570
p75:    3,135,014
Mean:   3,503,076
Median: 1,922,994
p25:    74,198
Min:    0.0

--------------
Column: ao_000
--------------
Type:   Continuous
Max:    142,751,664
p75:    2,682,324
Mean:   3,039,593
Median: 1,647,962
p25:    66,420
Min:    0.0

--------------
Column: ap_000
--------------
Type:   Continuous
Max:    115,613,054
p75:    726,100
Mean:   1,019,361
Median: 358,164
p25:    25,460
Min:    0.0

--------------
Column: aq_000
--------------
Type:   Continuous
Max:    34,558,772
p75:    377,362
Mean:   449,193
Median: 179,828
p25:    4,214
Min:    0.0

--------------
Column: ar_000
--------------
Type:   Continuous
Max:    350
p75:    0.0
Mean:   0.5
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: as_000
--------------
Type:   Continuous
Max:    6,383,704
p75:    0.0
Mean:   247
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: at_000
--------------
Type:   Continuous
Max:    10,654,346
p75:    0.0
Mean:   5,196
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: au_000
--------------
Type:   Continuous
Max:    5,711,474
p75:    0.0
Mean:   299
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: av_000
--------------
Type:   Continuous
Max:    794,458
p75:    648
Mean:   1,145
Median: 116
p25:    12
Min:    0.0

--------------
Column: ax_000
--------------
Type:   Continuous
Max:    116,652
p75:    264
Mean:   377
Median: 66
p25:    10
Min:    0.0

--------------
Column: ay_000
--------------
Type:   Continuous
Max:    74,041,092
p75:    0.0
Mean:   14,956
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ay_001
--------------
Type:   Continuous
Max:    80,525,378
p75:    0.0
Mean:   11,334
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ay_002
--------------
Type:   Continuous
Max:    30,231,168
p75:    0.0
Mean:   11,352
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ay_003
--------------
Type:   Continuous
Max:    13,945,170
p75:    0.0
Mean:   7,321
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ay_004
--------------
Type:   Continuous
Max:    40,028,704
p75:    0.0
Mean:   10,161
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ay_005
--------------
Type:   Continuous
Max:    124,948,914
p75:    39,808
Mean:   106,661
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ay_006
--------------
Type:   Continuous
Max:    127,680,326
p75:    1,269,102
Mean:   1,081,914
Median: 168,202
p25:    0.0
Min:    0.0

--------------
Column: ay_007
--------------
Type:   Continuous
Max:    489,678,156
p75:    1,339,224
Mean:   1,556,826
Median: 351,576
p25:    6,118
Min:    0.0

--------------
Column: ay_008
--------------
Type:   Continuous
Max:    326,836,844
p75:    620,060
Mean:   1,078,964
Median: 96,330
p25:    7,636
Min:    0.0

--------------
Column: ay_009
--------------
Type:   Continuous
Max:    18,824,656
p75:    0.0
Mean:   1,158
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: az_000
--------------
Type:   Continuous
Max:    10,124,620
p75:    4,164
Mean:   7,899
Median: 2,102
p25:    1,032
Min:    0.0

--------------
Column: az_001
--------------
Type:   Continuous
Max:    4,530,258
p75:    2,030
Mean:   4,486
Median: 642
p25:    60
Min:    0.0

--------------
Column: az_002
--------------
Type:   Continuous
Max:    14,217,662
p75:    3,144
Mean:   8,046
Median: 1,024
p25:    90
Min:    0.0

--------------
Column: az_003
--------------
Type:   Continuous
Max:    45,584,242
p75:    43,262
Mean:   87,721
Median: 3,604
p25:    296
Min:    0.0

--------------
Column: az_004
--------------
Type:   Continuous
Max:    132,037,276
p75:    1,777,978
Mean:   1,488,475
Median: 84,036
p25:    1,544
Min:    0.0

--------------
Column: az_005
--------------
Type:   Continuous
Max:    481,046,524
p75:    1,795,944
Mean:   2,162,875
Median: 529,990
p25:    39,066
Min:    0.0

--------------
Column: az_006
--------------
Type:   Continuous
Max:    64,589,140
p75:    4,284
Mean:   102,604
Median: 296
p25:    10
Min:    0.0

--------------
Column: az_007
--------------
Type:   Continuous
Max:    39,158,218
p75:    0.0
Mean:   17,884
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: az_008
--------------
Type:   Continuous
Max:    1,947,884
p75:    0.0
Mean:   634
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: az_009
--------------
Type:   Continuous
Max:    666,148
p75:    0.0
Mean:   37
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ba_000
--------------
Type:   Continuous
Max:    232,871,714
p75:    1,276,450
Mean:   1,414,577
Median: 681,582
p25:    33,716
Min:    0.0

--------------
Column: ba_001
--------------
Type:   Continuous
Max:    142,000,418
p75:    813,840
Mean:   904,111
Median: 445,210
p25:    15,175
Min:    0.0

--------------
Column: ba_002
--------------
Type:   Continuous
Max:    55,807,388
p75:    341,039
Mean:   417,998
Median: 186,752
p25:    5,305
Min:    0.0

--------------
Column: ba_003
--------------
Type:   Continuous
Max:    36,931,418
p75:    244,747
Mean:   277,535
Median: 134,820
p25:    1,882
Min:    0.0

--------------
Column: ba_004
--------------
Type:   Continuous
Max:    25,158,556
p75:    197,352
Mean:   207,682
Median: 102,432
p25:    628
Min:    0.0

--------------
Column: ba_005
--------------
Type:   Continuous
Max:    19,240,550
p75:    184,704
Mean:   190,997
Median: 84,380
p25:    380
Min:    0.0

--------------
Column: ba_006
--------------
Type:   Continuous
Max:    18,997,660
p75:    205,674
Mean:   211,596
Median: 70,426
p25:    356
Min:    0.0

--------------
Column: ba_007
--------------
Type:   Continuous
Max:    15,427,264
p75:    208,142
Mean:   186,797
Median: 4,670
p25:    74
Min:    0.0

--------------
Column: ba_008
--------------
Type:   Continuous
Max:    31,265,984
p75:    1,840
Mean:   36,271
Median: 22
p25:    0.0
Min:    0.0

--------------
Column: ba_009
--------------
Type:   Continuous
Max:    43,706,408
p75:    62
Mean:   37,121
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: bb_000
--------------
Type:   Continuous
Max:    234,981,844
p75:    3,871,060
Mean:   4,584,730
Median: 2,365,430
p25:    106,812
Min:    0.0

--------------
Column: bc_000
--------------
Type:   Continuous
Max:    396,952
p75:    138
Mean:   567
Median: 16
p25:    0.0
Min:    0.0

--------------
Column: bd_000
--------------
Type:   Continuous
Max:    306,452
p75:    440
Mean:   925
Median: 66
p25:    8.0
Min:    0.0

--------------
Column: be_000
--------------
Type:   Continuous
Max:    810,568
p75:    618
Mean:   1,382
Median: 180
p25:    18
Min:    0.0

--------------
Column: bf_000
--------------
Type:   Continuous
Max:    51,050
p75:    20
Mean:   76
Median: 2.0
p25:    0.0
Min:    0.0

--------------
Column: bg_000
--------------
Type:   Continuous
Max:    82,073,576
p75:    1,605,222
Mean:   1,832,518
Median: 1,005,052
p25:    30,170
Min:    0.0

--------------
Column: bh_000
--------------
Type:   Continuous
Max:    3,868,624
p75:    49,188
Mean:   58,777
Median: 26,462
p25:    864
Min:    0.0

--------------
Column: bi_000
--------------
Type:   Continuous
Max:    54,036,070
p75:    380,898
Mean:   496,961
Median: 180,566
p25:    16,070
Min:    0.0

--------------
Column: bj_000
--------------
Type:   Continuous
Max:    64,370,192
p75:    334,452
Mean:   520,377
Median: 154,840
p25:    8,682
Min:    0.0

--------------
Column: bk_000
--------------
Type:   Continuous
Max:    1,310,700
p75:    280,765
Mean:   280,380
Median: 210,720
p25:    162,900
Min:    0.0

--------------
Column: bl_000
--------------
Type:   Continuous
Max:    1,310,700
p75:    303,440
Mean:   321,169
Median: 222,800
p25:    170,860
Min:    0.0

--------------
Column: bm_000
--------------
Type:   Continuous
Max:    1,310,700
p75:    370,980
Mean:   399,816
Median: 239,120
p25:    172,100
Min:    0.0

--------------
Column: bn_000
--------------
Type:   Continuous
Max:    1,310,700
p75:    489,385
Mean:   462,365
Median: 251,400
p25:    171,240
Min:    0.0

--------------
Column: bo_000
--------------
Type:   Continuous
Max:    1,310,700
p75:    1,310,700
Mean:   511,080
Median: 270,300
p25:    170,790
Min:    0.0

--------------
Column: bp_000
--------------
Type:   Continuous
Max:    1,310,700
p75:    1,310,700
Mean:   548,709
Median: 287,520
p25:    172,160
Min:    0.0

--------------
Column: bq_000
--------------
Type:   Continuous
Max:    1,310,700
p75:    1,310,700
Mean:   579,698
Median: 303,960
p25:    171,300
Min:    0.0

--------------
Column: br_000
--------------
Type:   Continuous
Max:    1,310,700
p75:    1,310,700
Mean:   601,299
Median: 318,120
p25:    170,720
Min:    0.0

--------------
Column: bs_000
--------------
Type:   Continuous
Max:    1,037,240
p75:    118,960
Mean:   80,635
Median: 50,970
p25:    17,480
Min:    0.0

--------------
Column: bt_000
--------------
Type:   Continuous
Max:    42,949,672
p75:    48,910
Mean:   61,239
Median: 30,876
p25:    882
Min:    0.0

--------------
Column: bu_000
--------------
Type:   Continuous
Max:    234,981,844
p75:    3,868,154
Mean:   4,573,524
Median: 2,364,492
p25:    106,732
Min:    0.0

--------------
Column: bv_000
--------------
Type:   Continuous
Max:    234,981,844
p75:    3,868,154
Mean:   4,573,524
Median: 2,364,492
p25:    106,732
Min:    0.0

--------------
Column: bx_000
--------------
Type:   Continuous
Max:    531,835,592
p75:    3,648,142
Mean:   4,152,867
Median: 2,265,744
p25:    90,564
Min:    170

--------------
Column: by_000
--------------
Type:   Continuous
Max:    1,002,003
p75:    20,390
Mean:   22,219
Median: 12,648
p25:    220
Min:    0.0

--------------
Column: bz_000
--------------
Type:   Continuous
Max:    40,542,588
p75:    13,899
Mean:   103,441
Median: 1,052
p25:    8.0
Min:    0.0

--------------
Column: ca_000
--------------
Type:   Continuous
Max:    120,956
p75:    68,274
Mean:   39,323
Median: 25,563
p25:    6,986
Min:    0.0

--------------
Column: cb_000
--------------
Type:   Continuous
Max:    1,209,520
p75:    706,285
Mean:   406,832
Median: 280,250
p25:    78,080
Min:    0.0

--------------
Column: cc_000
--------------
Type:   Continuous
Max:    506,205,818
p75:    3,366,982
Mean:   3,844,509
Median: 2,117,140
p25:    63,131
Min:    0.0

--------------
Column: cd_000
--------------
Type:   Continuous
Max:    1,209,600
p75:    1,209,600
Mean:   1,209,600
Median: 1,209,600
p25:    1,209,600
Min:    1,209,600

--------------
Column: ce_000
--------------
Type:   Continuous
Max:    4,908,098
p75:    87,379
Mean:   64,279
Median: 3,408
p25:    264
Min:    0.0

--------------
Column: cf_000
--------------
Type:   Continuous
Max:    8,584,297,736
p75:    2.0
Mean:   150,231
Median: 2.0
p25:    0.0
Min:    0.0

--------------
Column: cg_000
--------------
Type:   Continuous
Max:    21,400
p75:    102
Mean:   91
Median: 46
p25:    8.0
Min:    0.0

--------------
Column: ch_000
--------------
Type:   Continuous
Max:    2.0
p75:    0.0
Mean:   0.0
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ci_000
--------------
Type:   Continuous
Max:    140,986,129
p75:    2,951,239
Mean:   3,518,083
Median: 1,864,677
p25:    49,018
Min:    0.0

--------------
Column: cj_000
--------------
Type:   Continuous
Max:    74,498,678
p75:    0.0
Mean:   105,858
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ck_000
--------------
Type:   Continuous
Max:    55,428,669
p75:    551,183
Mean:   723,505
Median: 251,373
p25:    14,743
Min:    0.0

--------------
Column: cl_000
--------------
Type:   Continuous
Max:    130,560
p75:    2.0
Mean:   360
Median: 0.0
p25:    0.0
Min:    0.0

-------------
Column: class
-------------
Type:      Categorical
Frequency: 98.19%, Label: neg
Other Labels: 1.81%

--------------
Column: cm_000
--------------
Type:   Continuous
Max:    98,124
p75:    100
Mean:   346
Median: 8.0
p25:    0.0
Min:    0.0

--------------
Column: cn_000
--------------
Type:   Continuous
Max:    12,567,090
p75:    0.0
Mean:   2,678
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: cn_001
--------------
Type:   Continuous
Max:    45,317,856
p75:    0.0
Mean:   24,041
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: cn_002
--------------
Type:   Continuous
Max:    83,977,900
p75:    8,049
Mean:   166,081
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: cn_003
--------------
Type:   Continuous
Max:    160,348,272
p75:    235,337
Mean:   541,879
Median: 34,962
p25:    4,654
Min:    0.0

--------------
Column: cn_004
--------------
Type:   Continuous
Max:    169,869,316
p75:    1,208,926
Mean:   1,298,715
Median: 519,680
p25:    19,532
Min:    0.0

--------------
Column: cn_005
--------------
Type:   Continuous
Max:    117,815,764
p75:    1,519,190
Mean:   1,348,579
Median: 703,196
p25:    5,194
Min:    0.0

--------------
Column: cn_006
--------------
Type:   Continuous
Max:    72,080,406
p75:    449,859
Mean:   410,362
Median: 97,184
p25:    634
Min:    0.0

--------------
Column: cn_007
--------------
Type:   Continuous
Max:    33,143,734
p75:    31,079
Mean:   64,725
Median: 10,016
p25:    64
Min:    0.0

--------------
Column: cn_008
--------------
Type:   Continuous
Max:    9,628,690
p75:    5,288
Mean:   19,651
Median: 1,858
p25:    0.0
Min:    0.0

--------------
Column: cn_009
--------------
Type:   Continuous
Max:    36,398,374
p75:    294
Mean:   8,018
Median: 26
p25:    0.0
Min:    0.0

--------------
Column: co_000
--------------
Type:   Continuous
Max:    8,584,297,742
p75:    74
Mean:   150,517
Median: 8.0
p25:    0.0
Min:    0.0

--------------
Column: cp_000
--------------
Type:   Continuous
Max:    496,360
p75:    82
Mean:   552
Median: 14
p25:    4.0
Min:    0.0

--------------
Column: cq_000
--------------
Type:   Continuous
Max:    234,981,844
p75:    3,868,154
Mean:   4,573,524
Median: 2,364,492
p25:    106,732
Min:    0.0

--------------
Column: cr_000
--------------
Type:   Continuous
Max:    57,450
p75:    0.0
Mean:   39
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: cs_000
--------------
Type:   Continuous
Max:    924,996
p75:    5,702
Mean:   5,522
Median: 3,200
p25:    1,234
Min:    0.0

--------------
Column: cs_001
--------------
Type:   Continuous
Max:    438,806
p75:    694
Mean:   792
Median: 362
p25:    32
Min:    0.0

--------------
Column: cs_002
--------------
Type:   Continuous
Max:    65,572,940
p75:    94,964
Mean:   243,057
Median: 20,804
p25:    232
Min:    0.0

--------------
Column: cs_003
--------------
Type:   Continuous
Max:    61,737,200
p75:    296,150
Mean:   360,247
Median: 122,272
p25:    3,138
Min:    0.0

--------------
Column: cs_004
--------------
Type:   Continuous
Max:    152,455,352
p75:    209,253
Mean:   452,960
Median: 91,325
p25:    2,772
Min:    0.0

--------------
Column: cs_005
--------------
Type:   Continuous
Max:    379,142,116
p75:    2,051,776
Mean:   2,257,808
Median: 1,225,808
p25:    19,493
Min:    0.0

--------------
Column: cs_006
--------------
Type:   Continuous
Max:    73,741,974
p75:    687,733
Mean:   547,189
Median: 241,496
p25:    13,482
Min:    0.0

--------------
Column: cs_007
--------------
Type:   Continuous
Max:    12,884,218
p75:    18,174
Mean:   14,724
Median: 6,132
p25:    1,208
Min:    0.0

--------------
Column: cs_008
--------------
Type:   Continuous
Max:    2,826,620
p75:    148
Mean:   247
Median: 46
p25:    2.0
Min:    0.0

--------------
Column: cs_009
--------------
Type:   Continuous
Max:    44,902,992
p75:    0.0
Mean:   1,024
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ct_000
--------------
Type:   Continuous
Max:    910,366
p75:    672
Mean:   749
Median: 212
p25:    40
Min:    0.0

--------------
Column: cu_000
--------------
Type:   Continuous
Max:    733,688
p75:    858
Mean:   1,217
Median: 280
p25:    82
Min:    0.0

--------------
Column: cv_000
--------------
Type:   Continuous
Max:    81,610,510
p75:    2,402,269
Mean:   1,930,346
Median: 1,186,436
p25:    24,468
Min:    0.0

--------------
Column: cx_000
--------------
Type:   Continuous
Max:    44,105,494
p75:    126,821
Mean:   353,598
Median: 44,669
p25:    944
Min:    0.0

--------------
Column: cy_000
--------------
Type:   Continuous
Max:    931,472
p75:    0.0
Mean:   255
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: cz_000
--------------
Type:   Continuous
Max:    19,156,530
p75:    6,250
Mean:   18,878
Median: 204
p25:    4.0
Min:    0.0

--------------
Column: da_000
--------------
Type:   Continuous
Max:    22,458
p75:    0.0
Mean:   7.5
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: db_000
--------------
Type:   Continuous
Max:    9,636
p75:    18
Mean:   13
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dc_000
--------------
Type:   Continuous
Max:    120,759,484
p75:    2,645,840
Mean:   2,202,093
Median: 1,743,089
p25:    26,907
Min:    0.0

--------------
Column: dd_000
--------------
Type:   Continuous
Max:    445,142
p75:    2,688
Mean:   3,177
Median: 1,360
p25:    132
Min:    0.0

--------------
Column: de_000
--------------
Type:   Continuous
Max:    176,176
p75:    296
Mean:   374
Median: 144
p25:    66
Min:    0.0

--------------
Column: df_000
--------------
Type:   Continuous
Max:    203,500,780
p75:    0.0
Mean:   6,999
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dg_000
--------------
Type:   Continuous
Max:    27,064,294
p75:    0.0
Mean:   7,034
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dh_000
--------------
Type:   Continuous
Max:    124,700,880
p75:    0.0
Mean:   4,302
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: di_000
--------------
Type:   Continuous
Max:    22,987,424
p75:    0.0
Mean:   36,887
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dj_000
--------------
Type:   Continuous
Max:    726,750
p75:    0.0
Mean:   31
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dk_000
--------------
Type:   Continuous
Max:    5,483,574
p75:    0.0
Mean:   1,659
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dl_000
--------------
Type:   Continuous
Max:    103,858,120
p75:    0.0
Mean:   28,872
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dm_000
--------------
Type:   Continuous
Max:    23,697,916
p75:    0.0
Mean:   7,841
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dn_000
--------------
Type:   Continuous
Max:    2,924,584
p75:    27,384
Mean:   34,248
Median: 14,384
p25:    666
Min:    0.0

--------------
Column: do_000
--------------
Type:   Continuous
Max:    2,472,198
p75:    37,718
Mean:   28,809
Median: 10,493
p25:    20
Min:    0.0

--------------
Column: dp_000
--------------
Type:   Continuous
Max:    535,316
p75:    8,332
Mean:   7,029
Median: 2,540
p25:    6.0
Min:    0.0

--------------
Column: dq_000
--------------
Type:   Continuous
Max:    6,351,872,864
p75:    0.0
Mean:   4,355,312
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dr_000
--------------
Type:   Continuous
Max:    50,137,662
p75:    0.0
Mean:   203,118
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ds_000
--------------
Type:   Continuous
Max:    4,970,962
p75:    99,272
Mean:   90,446
Median: 48,174
p25:    700
Min:    0.0

--------------
Column: dt_000
--------------
Type:   Continuous
Max:    855,260
p75:    17,638
Mean:   15,527
Median: 8,360
p25:    152
Min:    0.0

--------------
Column: du_000
--------------
Type:   Continuous
Max:    460,207,620
p75:    3,500,335
Mean:   4,120,839
Median: 188,110
p25:    5,520
Min:    0.0

--------------
Column: dv_000
--------------
Type:   Continuous
Max:    127,034,534
p75:    536,555
Mean:   604,406
Median: 31,146
p25:    758
Min:    0.0

--------------
Column: dx_000
--------------
Type:   Continuous
Max:    114,288,420
p75:    8,972
Mean:   789,470
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dy_000
--------------
Type:   Continuous
Max:    3,793,022
p75:    36
Mean:   7,840
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: dz_000
--------------
Type:   Continuous
Max:    1,414
p75:    0.0
Mean:   0.2
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ea_000
--------------
Type:   Continuous
Max:    8,506
p75:    0.0
Mean:   1.5
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: eb_000
--------------
Type:   Continuous
Max:    4,496,965,920
p75:    4,037,040
Mean:   9,924,877
Median: 628,980
p25:    0.0
Min:    0.0

-------------
Column: ec_00
-------------
Type:   Continuous
Max:    106,020
p75:    1,382
Mean:   1,358
Median: 760
p25:    115
Min:    0.0

--------------
Column: ed_000
--------------
Type:   Continuous
Max:    88,388
p75:    1,510
Mean:   1,464
Median: 836
p25:    98
Min:    0.0

--------------
Column: ee_000
--------------
Type:   Continuous
Max:    163,085,734
p75:    574,486
Mean:   744,503
Median: 261,804
p25:    15,902
Min:    0.0

--------------
Column: ee_001
--------------
Type:   Continuous
Max:    98,224,378
p75:    667,812
Mean:   788,310
Median: 348,474
p25:    8,620
Min:    0.0

--------------
Column: ee_002
--------------
Type:   Continuous
Max:    77,933,926
p75:    438,674
Mean:   449,576
Median: 235,448
p25:    2,986
Min:    0.0

--------------
Column: ee_003
--------------
Type:   Continuous
Max:    37,758,390
p75:    218,410
Mean:   213,246
Median: 112,672
p25:    1,184
Min:    0.0

--------------
Column: ee_004
--------------
Type:   Continuous
Max:    97,152,378
p75:    467,634
Mean:   450,647
Median: 223,002
p25:    2,730
Min:    0.0

--------------
Column: ee_005
--------------
Type:   Continuous
Max:    57,435,236
p75:    403,290
Mean:   400,620
Median: 190,986
p25:    3,646
Min:    0.0

--------------
Column: ee_006
--------------
Type:   Continuous
Max:    42,159,442
p75:    276,180
Mean:   337,868
Median: 93,536
p25:    530
Min:    0.0

--------------
Column: ee_007
--------------
Type:   Continuous
Max:    119,580,108
p75:    168,046
Mean:   347,561
Median: 41,260
p25:    112
Min:    0.0

--------------
Column: ee_008
--------------
Type:   Continuous
Max:    19,267,396
p75:    139,500
Mean:   139,896
Median: 3,862
p25:    0.0
Min:    0.0

--------------
Column: ee_009
--------------
Type:   Continuous
Max:    4,570,398
p75:    2,000
Mean:   8,424
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: ef_000
--------------
Type:   Continuous
Max:    482
p75:    0.0
Mean:   0.1
Median: 0.0
p25:    0.0
Min:    0.0

--------------
Column: eg_000
--------------
Type:   Continuous
Max:    1,720
p75:    0.0
Mean:   0.2
Median: 0.0
p25:    0.0
Min:    0.0
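
The summary above suggests some redundancy: cd_000 shows the same value (1,209,600) at every quantile, and bu_000, bv_000 and cq_000 report identical statistics. Matching summary statistics do not prove that two columns are equal row by row, so a direct check is worthwhile. The following is a minimal sketch in plain pandas. The helper name `find_redundant_columns` is hypothetical and not part of kxy, and the toy values are made up for illustration.

```python
import pandas as pd


def find_redundant_columns(frame):
    """Return (constant_columns, duplicate_pairs) for a dataframe.

    Hypothetical helper for illustration; not part of the kxy package.
    """
    # A column is constant when it holds at most one distinct non-missing value.
    constant = [c for c in frame.columns if frame[c].nunique(dropna=True) <= 1]

    # Two columns are duplicates when every row matches (NaNs in the same places).
    duplicates = []
    remaining = [c for c in frame.columns if c not in constant]
    for i, left in enumerate(remaining):
        for right in remaining[i + 1:]:
            if frame[left].equals(frame[right]):
                duplicates.append((left, right))
    return constant, duplicates


# Toy data mimicking the patterns seen in the summary; the values are invented.
toy = pd.DataFrame({
    "cd_000": [1209600, 1209600, 1209600],
    "bu_000": [10.0, 250.0, 3.0],
    "bv_000": [10.0, 250.0, 3.0],
    "aa_000": [5.0, 7.0, 9.0],
})
constant_cols, duplicate_pairs = find_redundant_columns(toy)
print(constant_cols)
print(duplicate_pairs)
```

On the real data, the same call would be `find_redundant_columns(df.drop(columns=[y_column]))`. Whether the flagged columns should be dropped is a modelling decision this check does not make.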

Data Valuation

In [4]:
df.kxy.data_valuation(y_column, problem_type=problem_type)
Out[4]:
   Achievable R-Squared  Achievable Log-Likelihood Per Sample  Achievable Accuracy
0                  0.75                              6.03e-01                 1.00

Automatic (Model-Free) Variable Selection

In [5]:
df.kxy.variable_selection(y_column, problem_type=problem_type)
Out[5]:
Selection Order  Variable     Running Achievable R-Squared  Running Achievable Accuracy
0                No Variable                          0.00                         0.98
1                ag_001                               0.06                         0.99
2                au_000                               0.06                         0.99
3                aa_000                               0.06                         0.99
4                ay_009                               0.06                         0.99
...              ...                                   ...                          ...
166              cs_004                               0.75                         1.00
167              az_008                               0.75                         1.00
168              ay_005                               0.75                         1.00
169              cs_008                               0.75                         1.00
170              dq_000                               0.75                         1.00

171 rows × 3 columns
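
The "No Variable" row reports an achievable accuracy of 0.98 with no inputs at all. This is consistent with the 98.19% frequency of the neg label in the summary: always predicting the most frequent class already reaches that accuracy. The sketch below computes that baseline in plain pandas. The helper name `majority_baseline_accuracy` is hypothetical and not part of kxy, and the toy labels are invented to mirror the class imbalance.

```python
import pandas as pd


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label.

    Hypothetical helper for illustration; not part of the kxy package.
    """
    # value_counts(normalize=True) returns class frequencies, largest first.
    return float(labels.value_counts(normalize=True).iloc[0])


# Toy labels: 98 'neg' and 2 'pos', roughly mirroring the imbalance above.
toy_labels = pd.Series(["neg"] * 98 + ["pos"] * 2)
baseline = majority_baseline_accuracy(toy_labels)
print(f"{baseline:.2f}")
```

On the real data, the call would be `majority_baseline_accuracy(df[y_column])`. With a baseline this high, the gain from 0.98 to 1.00 in the table covers nearly all of the minority class, so accuracy alone understates how much the variables contribute.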