Kernel 5.0 performance getting worse

OK, with the browser scaled to 170% I can read the text :smiley: :wink:

Which hardware are you using?
And which benchmark script?

Do you mean writing @"user", like @Keruskerfuerst?

Since you have an Intel CPU, can you check FLOPS with Linpack, ideally in runlevel 3 and by using at least half of your available RAM?
On a side note, disabling page table isolation (KPTI) can drastically improve performance on older Intel CPUs, but not so much on Haswell and later.
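
A quick way to see whether KPTI is active before comparing numbers is to read what the kernel reports for Meltdown. This is only a sketch: it assumes a kernel new enough to expose `/sys/devices/system/cpu/vulnerabilities/meltdown`, and `kpti_state` is a helper name made up for this example.

```shell
#!/bin/bash
# Classify the text the kernel reports for the Meltdown vulnerability.
# kpti_state is a hypothetical helper name, not an existing tool.
kpti_state() {
    case "$1" in
        *"Mitigation: PTI"*) echo "enabled" ;;
        *"Not affected"*)    echo "not needed" ;;
        *"Vulnerable"*)      echo "disabled" ;;
        *)                   echo "unknown" ;;
    esac
}

f=/sys/devices/system/cpu/vulnerabilities/meltdown
if [ -r "$f" ]; then
    echo "KPTI: $(kpti_state "$(cat "$f")")"
else
    echo "KPTI: unknown ($f is not available on this kernel)"
fi
```

Booting with `nopti` (or `pti=off`) on the kernel command line turns the isolation off, so a run with and without that parameter should show how much it costs on a given CPU.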

The number of lines of code in the kernel source tree has no relation to the size or performance of a compiled kernel.

Most drivers are compiled as modules and loaded only when needed.

The script uses darktable to convert a raw file to jpg. It measures the time darktable needs to execute the pixelpipe. This is a pure CPU/GPU activity; no disk I/O is involved. It also scales well with the RAM clock speed.

The script needs a raw file and an xmp file, which tells darktable which modules and parameters to apply to the raw file. Needless to say, this has a direct impact on the speed of the run, so the files should be the same in each run. I am using bench.SRW and bench.SRW.xmp. Both files have to be in the working directory. I always use the same working directory for all benchmarks.

The idea is based on this thread:

and the files can be found here:

http://www.mirada.ch/bench.SRW
http://www.mirada.ch/bench.SRW.xmp

And this is the script I am using:

bench-script-sequence-3.sh:
This script disables OpenCL, so darktable runs on the CPU only. This is the script I use when I talk about kernel performance.

rm -f test*.jpg
for d in $(seq 1 3); do
    echo -n "run $d: "
    darktable-cli bench.SRW test-$d.jpg --core --disable-opencl -d perf -d opencl | grep "processing took"
done

A typical output looks like this:

4# ./bench-script-sequence-3.sh
run 1: 15,532968 [dev_process_export] pixel pipeline processing took 14,968 secs (116,138 CPU)
run 2: 15,293704 [dev_process_export] pixel pipeline processing took 14,976 secs (116,147 CPU)
run 3: 15,299993 [dev_process_export] pixel pipeline processing took 14,982 secs (116,102 CPU)
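
To compare kernels it helps to boil the three runs down to one number. The sketch below averages the wall-clock seconds that follow "took" in each line; it assumes the output format shown above, including the decimal comma, and `mean_pipeline_secs` is a helper name made up for this example.

```shell
#!/bin/bash
# Average the wall-clock seconds from "processing took" lines read on stdin.
# mean_pipeline_secs is a hypothetical helper name.
mean_pipeline_secs() {
    LC_ALL=C awk '
        /processing took/ {
            for (i = 1; i < NF; i++)
                if ($i == "took") {
                    v = $(i + 1)
                    sub(",", ".", v)   # decimal comma -> decimal point
                    sum += v
                    n++
                }
        }
        END { if (n) printf "%.3f\n", sum / n }'
}

# Demo with the three lines quoted above; prints 14.975
printf '%s\n' \
    'run 1: 15,532968 [dev_process_export] pixel pipeline processing took 14,968 secs (116,138 CPU)' \
    'run 2: 15,293704 [dev_process_export] pixel pipeline processing took 14,976 secs (116,147 CPU)' \
    'run 3: 15,299993 [dev_process_export] pixel pipeline processing took 14,982 secs (116,102 CPU)' \
    | mean_pipeline_secs
```

In practice the benchmark output can be piped straight in: `./bench-script-sequence-3.sh | mean_pipeline_secs`.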

The effect of OpenCL is interesting: when turned on, it makes my darktable faster by a factor of 2 (my PC: i7-7700K + NVIDIA GTX 1050 Ti). Just remove --disable-opencl from the darktable command line to try it out.

Happy benchmarking! :wink:

Cool idea to use DarkTable!
Maybe I'll integrate this into my benchmark script, which I will publish once it works completely.

My old PC is quick???? :wink:

~/Bilder >>> bench-script-sequence-3                                                                                                
run 1: [defaults] found a 64-bit system with 16362016 kb ram and 8 cores (0 atom based)
[defaults] setting very high quality defaults
1,023878 [dev_process_export] pixel pipeline processing took 0,544 secs (3,344 CPU)
run 2: 0,943988 [dev_process_export] pixel pipeline processing took 0,534 secs (3,392 CPU)
run 3: 0,937706 [dev_process_export] pixel pipeline processing took 0,521 secs (3,417 CPU)

Or did I make a mistake? :smiley:

2nd

~/Bilder >>> bench-script-sequence-3                                                                                                
run 1: 0,970432 [dev_process_export] pixel pipeline processing took 0,555 secs (3,441 CPU)
run 2: 0,951806 [dev_process_export] pixel pipeline processing took 0,528 secs (3,464 CPU)
run 3: 0,954701 [dev_process_export] pixel pipeline processing took 0,533 secs (3,484 CPU)

i7-3770K + GTX 1060 6GB

I don't think so. It looks more like DT is not running through the full pixelpipe.

I suggest you look at the full debug output. Go to the same working directory and execute:

darktable-cli bench.SRW test.jpg --core --disable-opencl -d perf -d opencl

And if that does not help, run it with full debugging:

darktable-cli bench.SRW test.jpg --core --disable-opencl -d all

EDIT:
Your jpg size also differs considerably from what I get. My jpgs are around 1,4 MB; yours are 7,4 MB. Something is odd here.

Your bench.SRW.xmp seems to be wrong. The file should be 6,4 kB, not 1,2 kB.
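
A truncated sidecar like this can be caught before the benchmark starts. The check below is only a sketch: `has_min_size` is a helper name made up for this example, and the 6000-byte threshold is an assumption derived from the 6,4 kB figure, not an exact expected size.

```shell
#!/bin/bash
# Succeeds when the file exists and has at least the given number of bytes.
# has_min_size is a hypothetical helper; the 6000-byte threshold used below
# is an assumption based on the "6,4 kB" figure, not an exact expected size.
has_min_size() {
    local file=$1 min=$2 size
    [ -f "$file" ] || return 1
    size=$(wc -c < "$file")
    [ "$size" -ge "$min" ]
}

if has_min_size bench.SRW.xmp 6000; then
    echo "bench.SRW.xmp looks complete"
else
    echo "bench.SRW.xmp is missing or too small, download it again"
fi
```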

There was a problem with the download :slight_smile: Can you please post it as text here?

<?xml version="1.0" encoding="UTF-8"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmp="http://ns.adobe.com/xap/1.0/"
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:darktable="http://darktable.sf.net/"
   xmp:Rating="1"
   xmpMM:DerivedFrom="bench.SRW"
   darktable:xmp_version="2"
   darktable:raw_params="0"
   darktable:auto_presets_applied="1"
   darktable:history_end="4">
   <darktable:mask_id>
    <rdf:Seq/>
   </darktable:mask_id>
   <darktable:mask_type>
    <rdf:Seq/>
   </darktable:mask_type>
   <darktable:mask_name>
    <rdf:Seq/>
   </darktable:mask_name>
   <darktable:mask_version>
    <rdf:Seq/>
   </darktable:mask_version>
   <darktable:mask>
    <rdf:Seq/>
   </darktable:mask>
   <darktable:mask_nb>
    <rdf:Seq/>
   </darktable:mask_nb>
   <darktable:mask_src>
    <rdf:Seq/>
   </darktable:mask_src>
   <darktable:history>
    <rdf:Seq>
     <rdf:li
      darktable:operation="flip"
      darktable:enabled="1"
      darktable:modversion="2"
      darktable:params="ffffffff"
      darktable:multi_name=""
      darktable:multi_priority="0"
      darktable:blendop_version="8"
      darktable:blendop_params="gz11eJxjYGBgkGAAgRNODGiAEV0AJ2iwh+CRxQcA5qIZBA=="/>
     <rdf:li
      darktable:operation="sharpen"
      darktable:enabled="1"
      darktable:modversion="1"
      darktable:params="000000400000003f0000003f"
      darktable:multi_name=""
      darktable:multi_priority="0"
      darktable:blendop_version="8"
      darktable:blendop_params="gz11eJxjYGBgkGAAgRNODGiAEV0AJ2iwh+CRxQcA5qIZBA=="/>
     <rdf:li
      darktable:operation="basecurve"
      darktable:enabled="1"
      darktable:modversion="5"
      darktable:params="gz09eJxjYICA3Zqqtnqyn20MnTjsuK/m2oVpPrWrSLS3/7g3HIjL7RkYGqB4FAwlwIbEZsKQhcQpAAS6D6o="
      darktable:multi_name=""
      darktable:multi_priority="0"
      darktable:blendop_version="8"
      darktable:blendop_params="gz11eJxjYGBgkGAAgRNODGiAEV0AJ2iwh+CRxQcA5qIZBA=="/>
     <rdf:li
      darktable:operation="exposure"
      darktable:enabled="1"
      darktable:modversion="5"
      darktable:params="00000000accfd53c3889213f00004842000080c0"
      darktable:multi_name=""
      darktable:multi_priority="0"
      darktable:blendop_version="8"
      darktable:blendop_params="gz11eJxjYGBgkGAAgRNODGiAEV0AJ2iwh+CRxQcA5qIZBA=="/>
    </rdf:Seq>
   </darktable:history>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>

https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.0-Net-Perf-Bisect

Here you go!

bench.SRW.xmp

<?xml version="1.0" encoding="UTF-8"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 4.4.0-Exiv2">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmp="http://ns.adobe.com/xap/1.0/"
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:darktable="http://darktable.sf.net/"
   xmp:Rating="1"
   xmpMM:DerivedFrom="SAM_1580.SRW"
   darktable:xmp_version="1"
   darktable:raw_params="0"
   darktable:auto_presets_applied="1"
   darktable:history_end="17">
   <darktable:mask_id>
    <rdf:Seq/>
   </darktable:mask_id>
   <darktable:mask_type>
    <rdf:Seq/>
   </darktable:mask_type>
   <darktable:mask_name>
    <rdf:Seq/>
   </darktable:mask_name>
   <darktable:mask_version>
    <rdf:Seq/>
   </darktable:mask_version>
   <darktable:mask>
    <rdf:Seq/>
   </darktable:mask>
   <darktable:mask_nb>
    <rdf:Seq/>
   </darktable:mask_nb>
   <darktable:mask_src>
    <rdf:Seq/>
   </darktable:mask_src>
   <darktable:history_modversion>
    <rdf:Seq>
     <rdf:li>2</rdf:li>
     <rdf:li>2</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>5</rdf:li>
     <rdf:li>3</rdf:li>
     <rdf:li>3</rdf:li>
     <rdf:li>2</rdf:li>
     <rdf:li>2</rdf:li>
     <rdf:li>2</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>3</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>2</rdf:li>
     <rdf:li>3</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>2</rdf:li>
     <rdf:li>5</rdf:li>
    </rdf:Seq>
   </darktable:history_modversion>
   <darktable:history_enabled>
    <rdf:Seq>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
     <rdf:li>1</rdf:li>
    </rdf:Seq>
   </darktable:history_enabled>
   <darktable:history_operation>
    <rdf:Seq>
     <rdf:li>flip</rdf:li>
     <rdf:li>basecurve</rdf:li>
     <rdf:li>sharpen</rdf:li>
     <rdf:li>shadhi</rdf:li>
     <rdf:li>colorreconstruct</rdf:li>
     <rdf:li>demosaic</rdf:li>
     <rdf:li>highlights</rdf:li>
     <rdf:li>temperature</rdf:li>
     <rdf:li>levels</rdf:li>
     <rdf:li>bilat</rdf:li>
     <rdf:li>globaltonemap</rdf:li>
     <rdf:li>tonemap</rdf:li>
     <rdf:li>colorcontrast</rdf:li>
     <rdf:li>colorzones</rdf:li>
     <rdf:li>atrous</rdf:li>
     <rdf:li>nlmeans</rdf:li>
     <rdf:li>lens</rdf:li>
    </rdf:Seq>
   </darktable:history_operation>
   <darktable:history_params>
    <rdf:Seq>
     <rdf:li>ffffffff</rdf:li>
     <rdf:li>gz09eJxjYICA3Zqqtnqyn20MnTjsuK/m2oVpPrWrSLS3/7g3HIjL7RkYGqB4FAwlwIbEZkJiAwBVIQ4s</rdf:li>
     <rdf:li>000000400000003f0000003f</rdf:li>
     <rdf:li>000000000000c842cccc644200000000c3f55cc200000000000048420000c842000048427f000000bd37863500000000</rdf:li>
     <rdf:li>0000c8421f65d44300002041c3f5283f00000000</rdf:li>
     <rdf:li>000000000ad7a33c000000000100000000000000</rdf:li>
     <rdf:li>020000000000803f00000000000000004c37893f</rdf:li>
     <rdf:li>00409c45000026400000803f0400cd3f</rdf:li>
     <rdf:li>0000000000000000000048420000c8427d55a23cd9120d3f0000803f</rdf:li>
     <rdf:li>0000c04100005442cdcc4c3e</rdf:li>
     <rdf:li>02000000f6285c3faec7d742a0ef273d</rdf:li>
     <rdf:li>df4f1d409a990942</rdf:li>
     <rdf:li>ae47813f00000000c3f5883f0000000001000000</rdf:li>
     <rdf:li>gz03eJxjYoAAVU8hO1XPSXbbc28DaSH7vNtm9ttzo+0ZGBrsqSCPF6vpPrRT0X1jtziRzb70uIj9pnoWe28hBntWqBwh/SD7AZmdKRc=</rdf:li>
     <rdf:li>gz03eJxjZ4CAs2d87M6eOWM3a6akPZBtz8DQYE8HcfvVq7jsQ0NFgWwFIFsLyDYFi5OIGTo7OGw7OzpsgXYAaQ47oFl2UHvBYkB77UBuANprBwB1w0kn</rdf:li>
     <rdf:li>0000004000005042d7a3f03e0000803f</rdf:li>
     <rdf:li>gz04eJyzZoCBBvsdckftgbQjA0OCAwNDlQsjUNQvwtjAgGHgQHBibnFpXrqCoVlurkKavpGeCYOJQkBiXnJidir9XNEAChd7UHgAAEoJEDw=</rdf:li>
    </rdf:Seq>
   </darktable:history_params>
   <darktable:blendop_params>
    <rdf:Seq>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
     <rdf:li>gz12eJxjYGBgkGAAgRNODESDBnsIHll8ANNSGQM=</rdf:li>
    </rdf:Seq>
   </darktable:blendop_params>
   <darktable:blendop_version>
    <rdf:Seq>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
     <rdf:li>7</rdf:li>
    </rdf:Seq>
   </darktable:blendop_version>
   <darktable:multi_priority>
    <rdf:Seq>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
     <rdf:li>0</rdf:li>
    </rdf:Seq>
   </darktable:multi_priority>
   <darktable:multi_name>
    <rdf:Seq>
     <rdf:li> </rdf:li>
     <rdf:li> </rdf:li>
     <rdf:li> </rdf:li>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
     <rdf:li/>
    </rdf:Seq>
   </darktable:multi_name>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>

Thank you :slight_smile:
Lucky again, I was afraid I wouldn't need to buy a 3rd-generation Threadripper :smiley:

~/Bilder >>> bench-script-sequence-3                                                   
run 1: 27,956996 [dev_process_export] pixel pipeline processing took 27,534 secs (190,895 CPU)
run 2: 28,756199 [dev_process_export] pixel pipeline processing took 28,283 secs (173,685 CPU)
run 3: 28,355852 [dev_process_export] pixel pipeline processing took 27,908 secs (179,980 CPU)

Hmm, the picture is 3,9 MB.

I just ran the test twice and also get 3.9 MB. Did you change the compression or something else in darktable?

run 1: 13,304864 [dev_process_export] pixel pipeline processing took 12,919 secs (192,280 CPU)
run 2: 13,249682 [dev_process_export] pixel pipeline processing took 12,863 secs (192,921 CPU)
run 3: 13,264564 [dev_process_export] pixel pipeline processing took 12,875 secs (192,494 CPU)

FYI: with the CPU wars going on between Intel and AMD, new and used CPUs alike are dropping in price. I was lucky 3 years ago and purchased a Xeon for $160 that is equivalent to an i7-5960X. There are bargains everywhere these days!

Everything under 6 cylinders (64 cores) is a rollator :wink:

OT ;-)

I am still trying to figure out the details. What I can tell so far is that when I remove my darktable config directory (~/.config/darktable) I see the same file size as you: 3,8 MB. With my own config directory in place I see 1,4 MB.

There is just one meaningful difference in the exif of both files:

1,4 MB file:
Y Cb Cr Sub Sampling : YCbCr4:2:0 (2 2)

3,8 MB file:
Y Cb Cr Sub Sampling : YCbCr4:4:4 (1 1)
This seems to be the default for darktable.
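
The lines above have the shape of exiftool output, so the same check can presumably be repeated with it. The sketch assumes exiftool is installed and that the benchmark wrote its `test-*.jpg` files into the current directory; `subsampling_value` is a helper name made up for this example.

```shell
#!/bin/bash
# Pull the value out of an exiftool-style "Y Cb Cr Sub Sampling : value" line on stdin.
# subsampling_value is a hypothetical helper name.
subsampling_value() {
    sed -n 's/^Y Cb Cr Sub Sampling *: *//p'
}

# Runs only when exiftool is installed and the benchmark output files exist.
if command -v exiftool >/dev/null 2>&1; then
    for f in test-*.jpg; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$f" "$(exiftool -YCbCrSubSampling "$f" | subsampling_value)"
    done
fi
```

4:2:0 stores the two chroma channels at half resolution in both directions, while 4:4:4 keeps them at full resolution, so the subsampling alone accounts for part of the size difference.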

When I start DT with an empty config directory it says:

[defaults] found a 64-bit system with 32898040 kb ram and 8 cores (0 atom based)
[defaults] setting very high quality defaults

The only difference I see in the GUI is that the default jpg export quality is 95 % while my personal config says 80 %. Maybe that results in a different Y Cb Cr subsampling. I don't know.

It is also interesting to see that the default settings make things faster:

With default settings (95 % jpg):

run 1: 14,548838 [dev_process_export] pixel pipeline processing took 14,180 secs (109,000 CPU)
run 2: 14,564925 [dev_process_export] pixel pipeline processing took 14,197 secs (109,056 CPU)
run 3: 14,514895 [dev_process_export] pixel pipeline processing took 14,145 secs (108,639 CPU)

With my settings (80 % jpg):

run 1: 15,732961 [dev_process_export] pixel pipeline processing took 15,413 secs (117,769 CPU)
run 2: 15,545543 [dev_process_export] pixel pipeline processing took 15,221 secs (117,103 CPU)
run 3: 15,530032 [dev_process_export] pixel pipeline processing took 15,209 secs (117,078 CPU)
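
To put a number on the gap, the two groups can be reduced to their mean wall-clock times (14,174 s with the defaults, 15,281 s with my settings) and compared. This is just the arithmetic as a sketch; `pct_slower` is a helper name made up for this example.

```shell
#!/bin/bash
# Percentage by which the second time is slower than the first.
# pct_slower is a hypothetical helper name.
pct_slower() {
    LC_ALL=C awk -v base="$1" -v other="$2" \
        'BEGIN { printf "%.1f\n", (other / base - 1) * 100 }'
}

# Means of the three wall-clock times in each group above:
#   defaults (95 % jpg): (14.180 + 14.197 + 14.145) / 3 = 14.174
#   my settings (80 %):  (15.413 + 15.221 + 15.209) / 3 = 15.281
pct_slower 14.174 15.281   # prints 7.8
```

So with these runs the personal config is roughly 8 % slower than the defaults.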

Anyways, your file sizes seem to be accurate.

The Xeon I purchased 3 years ago has 12c/24t. It was the best bang for the buck for me at the time. I've got my eye on a 20c/40t Xeon that is compatible with my motherboard. I'm waiting for it to drop in price. 14 months ago it was priced at $1,100. Now it is $460. I'm waiting for it to drop below $200. I made a commitment, years ago, to never pay more than $200 for a CPU. I stick a generation or two behind current tech: (1) I save money, and (2) it has been thoroughly tested and is supported by all.

I made a change to the darktable benchmark script.

I added the darktable parameter --configdir /dev/null to make it independent of a user's darktable config files.

rm -f test*.jpg
for d in $(seq 1 3); do
    echo -n "run $d: "
    darktable-cli bench.SRW test-$d.jpg --core --configdir /dev/null --disable-opencl -d perf -d opencl 2>/dev/null | grep "processing took"
done

Please use this script to achieve comparable results.
