XGBoost's CUDA acceleration

It seems XGBoost supports GPU acceleration via CUDA (9?) by setting tree_method to 'gpu_hist' in xgb_params.

In xgboost_cost_model.py I added 'tree_method': 'gpu_hist' and ran a few tests (16-core CPU, GTX 1080).
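For reference, the change boils down to setting tree_method in the parameter dict passed to xgb.train. The snippet below is a standalone sketch with synthetic data (the array sizes and hyperparameters are placeholders, not the values used in the cost model), and it assumes an XGBoost build compiled with CUDA support.

```python
import numpy as np
import xgboost as xgb

# Synthetic stand-in for the autotvm feature matrix and throughput labels
# (sizes are assumptions, not the tuner's actual data).
X = np.random.rand(4096, 256).astype(np.float32)
y = np.random.rand(4096).astype(np.float32)
dtrain = xgb.DMatrix(X, label=y)

params = {
    "max_depth": 3,
    "eta": 0.3,
    "objective": "reg:squarederror",
    # The one-line change discussed above; requires an XGBoost build with
    # CUDA enabled, otherwise training fails at startup.
    "tree_method": "gpu_hist",
}

booster = xgb.train(params, dtrain, num_boost_round=400)
```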

WITH 'gpu_hist'

First run:

[Task 1/42] Current/Best: 178.70/2169.96 GFLOPS | Progress: (256/256) | 901.07 s Done.

Second run:

[Task 1/42] Current/Best: 1669.95/1804.79 GFLOPS | Progress: (256/256) | 904.57 s Done.

WITHOUT 'gpu_hist'

First run:

[Task 1/42] Current/Best: 48.44/1714.60 GFLOPS | Progress: (256/256) | 980.04 s Done.

Second run:

[Task 1/42] Current/Best: 113.77/1672.49 GFLOPS | Progress: (256/256) | 1038.44 s Done.

Even though I only ran each test twice, you can see that 'gpu_hist' does complete a bit faster. I also saw about 2-4% CUDA usage on my GPU while the XGBoost cost model was running. Is this something that should be exposed in the public API, or was there a reason it was excluded?

Can someone else verify the benefit?
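If anyone wants to reproduce this without going through a full tuning run, a rough micro-benchmark like the one below isolates just the XGBoost fit (synthetic data; the matrix size and boosting rounds are guesses at what the cost model sees, not measured from autotvm). It only shows whether 'gpu_hist' speeds up the model training on a given machine, not end-to-end tuning time.

```python
import time

import numpy as np
import xgboost as xgb

# Synthetic regression problem; sizes are an assumption, not taken from autotvm.
X = np.random.rand(20000, 256).astype(np.float32)
y = np.random.rand(20000).astype(np.float32)
dtrain = xgb.DMatrix(X, label=y)

for method in ("hist", "gpu_hist"):
    params = {
        "max_depth": 6,
        "eta": 0.3,
        "objective": "reg:squarederror",
        "tree_method": method,
    }
    start = time.time()
    xgb.train(params, dtrain, num_boost_round=300)
    print(f"{method}: {time.time() - start:.2f} s")
```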

I think you misunderstood the intention. XGBoost is only used to find better parameters for the ops, so the first run can't show the information you need; the graph tuner will actually run those parameters over your RPC target to measure the real compute capability. Am I right, @tqchen?

I'm saying that enabling GPU acceleration computes the jobs faster than using multiple CPU cores. The problem I'm finding is that it hard-crashes when it moves on to the next task.

I don't think that would help a lot, though. GPU acceleration for XGBoost speeds up the cost-model training, but the auto-tuning bottleneck is compilation and measurement, not determining the next batch of candidates, and any such speedup can easily be masked by server load. For example, I once tuned an op for 2,000 trials on a V100: the tuning used to finish in about 2 hours (~3.6 sec/trial on average), but the same task can sometimes take almost 3 hours (~5 sec/trial on average) on the same server.
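To make that concrete, here is a back-of-envelope comparison using the V100 numbers above. The per-batch cost-model fit time and batch size are assumed placeholders, not measurements, but even a generous estimate of the model-fitting overhead ends up far smaller than the run-to-run swing caused by server load.

```python
trials = 2000
fast = 2 * 3600 / trials      # ~3.6 s/trial on a quiet server (from above)
slow = 3 * 3600 / trials      # ~5 s/trial when the server is loaded

batch_size = 64               # candidates measured per cost-model refit (assumption)
fit_sec_per_batch = 1.0       # assumed XGBoost fit time per refit (assumption)

model_overhead = (trials / batch_size) * fit_sec_per_batch  # ~31 s total
load_variance = trials * (slow - fast)                      # ~3600 s swing

print(f"cost-model fitting:  {model_overhead:.0f} s")
print(f"server-load swing:   {load_variance:.0f} s")
```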