how to pick cipher for AES-NI enabled AMD GX-412TC SOC tincd at 100% CPU
Jelle de Jong
jelledejong at powercraft.nl
Sat May 9 12:25:07 CEST 2020
Hello everybody,
I would also love to know how I can optimize my tinc setup so it goes
faster without using 100% CPU load for 10MB/s...
Kind regards,
Jelle de Jong
On 2020-04-04 21:33, Jelle de Jong wrote:
> Hello everybody,
>
> Thank you Fufu Fang for your quick reply:
>
> With tinc version 1.0.35 and the options below, at 100% CPU load I get
> about 10 MB/s...
>
> PMTU = 1400
> PMTUDiscovery = yes
> #Cipher = none
> Cipher = chacha20-poly1305
> Digest = blake2b512
>
> I tried Cipher = none as well and also got 10 MB/s with 100% CPU on one
> thread; the other three available threads are idle.
>
> With tinc_1.1~pre17-1.1_amd64.deb and libssl1.1:amd64 1.1.1d-0+deb10u2 I
> get the following error:
>
> Apr 04 19:03:19 officelink01 tincd[522]: Error while decrypting: error:060A7094:digital envelope routines:EVP_EncryptUpdate:invalid operation
>
> installation steps:
> wget http://ftp.nl.debian.org/debian/pool/main/t/tinc/tinc_1.1~pre17-1.1_amd64.deb
>
> dpkg -i tinc_1.1~pre17-1.1_amd64.deb
> apt-get -f install
>
> Any speed improvement ideas?
>
> Kind regards,
>
> Jelle
>
> On 2020-04-04 20:02, Jelle de Jong wrote:
>> Hello everybody,
>>
>> First, a big thanks for tinc-vpn; I am still using it next to WireGuard
>> and OpenVPN.
>>
>> I have a setup where the tinc Debian appliance is at 100% CPU
>> load doing about 7.5 MB/s.
>>
>> Compression = 9
>> PMTU = 1400
>> PMTUDiscovery = yes
>> Cipher = aes-128-cbc
>>
>> How can I pick the cipher that is fastest for my CPU and doesn't create
>> a CPU bottleneck at 100%?
>>
>> Kind regards,
>>
>> Jelle de Jong
>>
>> root@officelink01:~# lscpu
>> Architecture: x86_64
>> CPU op-mode(s): 32-bit, 64-bit
>> Byte Order: Little Endian
>> Address sizes: 40 bits physical, 48 bits virtual
>> CPU(s): 4
>> On-line CPU(s) list: 0-3
>> Thread(s) per core: 1
>> Core(s) per socket: 4
>> Socket(s): 1
>> NUMA node(s): 1
>> Vendor ID: AuthenticAMD
>> CPU family: 22
>> Model: 48
>> Model name: AMD GX-412TC SOC
>> Stepping: 1
>> CPU MHz: 775.729
>> CPU max MHz: 1000.0000
>> CPU min MHz: 600.0000
>> BogoMIPS: 1996.08
>> Virtualization: AMD-V
>> L1d cache: 32K
>> L1i cache: 32K
>> L2 cache: 2048K
>> NUMA node0 CPU(s): 0-3
>> Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>> pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
>> fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good acc_power nopl
>> nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3
>> cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c lahf_lm cmp_legacy
>> svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
>> skinit wdt topoext perfctr_nb bpext ptsc perfctr_llc cpb hw_pstate
>> ssbd vmmcall bmi1 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale
>> flushbyasid decodeassists pausefilter pfthreshold overflow_recov
>>
>> root@officelink01:~# openssl help
>> Standard commands
>> asn1parse ca ciphers cms
>> crl crl2pkcs7 dgst dhparam
>> dsa dsaparam ec ecparam
>> enc engine errstr gendsa
>> genpkey genrsa help list
>> nseq ocsp passwd pkcs12
>> pkcs7 pkcs8 pkey pkeyparam
>> pkeyutl prime rand rehash
>> req rsa rsautl s_client
>> s_server s_time sess_id smime
>> speed spkac srp storeutl
>> ts verify version x509
>>
>> Message Digest commands (see the `dgst' command for more details)
>> blake2b512 blake2s256 gost md4
>> md5 rmd160 sha1 sha224
>> sha256 sha3-224 sha3-256 sha3-384
>> sha3-512 sha384 sha512 sha512-224
>> sha512-256 shake128 shake256 sm3
>>
>> Cipher commands (see the `enc' command for more details)
>> aes-128-cbc aes-128-ecb aes-192-cbc aes-192-ecb
>> aes-256-cbc aes-256-ecb aria-128-cbc aria-128-cfb
>> aria-128-cfb1 aria-128-cfb8 aria-128-ctr aria-128-ecb
>> aria-128-ofb aria-192-cbc aria-192-cfb aria-192-cfb1
>> aria-192-cfb8 aria-192-ctr aria-192-ecb aria-192-ofb
>> aria-256-cbc aria-256-cfb aria-256-cfb1 aria-256-cfb8
>> aria-256-ctr aria-256-ecb aria-256-ofb base64
>> bf bf-cbc bf-cfb bf-ecb
>> bf-ofb camellia-128-cbc camellia-128-ecb camellia-192-cbc
>> camellia-192-ecb camellia-256-cbc camellia-256-ecb cast
>> cast-cbc cast5-cbc cast5-cfb cast5-ecb
>> cast5-ofb des des-cbc des-cfb
>> des-ecb des-ede des-ede-cbc des-ede-cfb
>> des-ede-ofb des-ede3 des-ede3-cbc des-ede3-cfb
>> des-ede3-ofb des-ofb des3 desx
>> rc2 rc2-40-cbc rc2-64-cbc rc2-cbc
>> rc2-cfb rc2-ecb rc2-ofb rc4
>> rc4-40 seed seed-cbc seed-cfb
>> seed-ecb seed-ofb sm4-cbc sm4-cfb
>> sm4-ctr sm4-ecb sm4-ofb
>>
>> root@officelink01:~# openssl speed -elapsed -evp aes-128-cbc
>> You have chosen to measure elapsed time instead of user CPU time.
>> Doing aes-128-cbc for 3s on 16 size blocks: 13905799 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 64 size blocks: 6572120 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 256 size blocks: 2254183 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 1024 size blocks: 623111 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 8192 size blocks: 80058 aes-128-cbc's in 3.00s
>> Doing aes-128-cbc for 3s on 16384 size blocks: 40180 aes-128-cbc's in 3.00s
>> OpenSSL 1.1.1d 10 Sep 2019
>> built on: Sat Oct 12 19:56:43 2019 UTC
>> options:bn(64,64) rc4(8x,int) des(int) aes(partial) blowfish(ptr)
>> compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall
>> -Wa,--noexecstack -g -O2
>> -fdebug-prefix-map=/build/openssl-YwazYa/openssl-1.1.1d=.
>> -fstack-protector-strong -Wformat -Werror=format-security
>> -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ
>> -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5
>> -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM
>> -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM
>> -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG
>> -Wdate-time -D_FORTIFY_SOURCE=2
>> The 'numbers' are in 1000s of bytes per second processed.
>> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
>> aes-128-cbc     74164.26k   140205.23k   192356.95k   212688.55k   218611.71k   219436.37k
>> root@officelink01:~# openssl speed -elapsed -evp aes-256-cbc
>> You have chosen to measure elapsed time instead of user CPU time.
>> Doing aes-256-cbc for 3s on 16 size blocks: 12322268 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 64 size blocks: 5283431 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 256 size blocks: 1686231 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 1024 size blocks: 454425 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 8192 size blocks: 58092 aes-256-cbc's in 3.00s
>> Doing aes-256-cbc for 3s on 16384 size blocks: 29035 aes-256-cbc's in 3.00s
>> OpenSSL 1.1.1d 10 Sep 2019
>> built on: Sat Oct 12 19:56:43 2019 UTC
>> options:bn(64,64) rc4(8x,int) des(int) aes(partial) blowfish(ptr)
>> compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -Wall
>> -Wa,--noexecstack -g -O2
>> -fdebug-prefix-map=/build/openssl-YwazYa/openssl-1.1.1d=.
>> -fstack-protector-strong -Wformat -Werror=format-security
>> -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ
>> -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5
>> -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM
>> -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAESNI_ASM -DVPAES_ASM
>> -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG
>> -Wdate-time -D_FORTIFY_SOURCE=2
>> The 'numbers' are in 1000s of bytes per second processed.
>> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
>> aes-256-cbc     65718.76k   112713.19k   143891.71k   155110.40k   158629.89k   158569.81k
>> root@officelink01:~#
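
Editorial note: the benchmark figures above can be put next to the tinc throughput reported in this thread. The following is a rough Python sketch, not a diagnosis. It only reuses numbers quoted in the messages (the openssl speed counts and rates, about 10 MB/s through tinc, PMTU = 1400). The function names are made up for illustration, and two things are assumptions: that "MB" means 10^6 bytes, and that every packet is a full 1400 bytes, which gives a lower bound on the packet rate.

```python
# All figures are copied from the thread; this script measures nothing itself.
AES128_CBC_K = 219436.37    # openssl speed, 16384-byte blocks (1000s of bytes/s)
AES256_CBC_K = 158569.81    # same column for aes-256-cbc
TINC_MB_PER_S = 10.0        # reported tinc throughput at 100% CPU (assumed 10^6 bytes/s units)
PMTU_BYTES = 1400           # from the tinc configuration in the thread


def openssl_rate_k(count, block_size, seconds):
    """Recompute an `openssl speed` table entry in 1000s of bytes per second."""
    return count * block_size / seconds / 1000.0


def headroom(cipher_rate_k, observed_mb_per_s):
    """How many times faster the raw cipher benchmark is than the observed tunnel rate."""
    return (cipher_rate_k / 1000.0) / observed_mb_per_s


def packets_per_second(observed_mb_per_s, packet_bytes):
    """Packet rate if every packet were exactly packet_bytes long (a lower bound)."""
    return observed_mb_per_s * 1_000_000 / packet_bytes


if __name__ == "__main__":
    # Sanity check: 40180 blocks of 16384 bytes in 3.00 s matches the table entry.
    print(f"recomputed aes-128-cbc rate: {openssl_rate_k(40180, 16384, 3.0):.2f}k")
    print(f"aes-128-cbc headroom: {headroom(AES128_CBC_K, TINC_MB_PER_S):.1f}x")
    print(f"aes-256-cbc headroom: {headroom(AES256_CBC_K, TINC_MB_PER_S):.1f}x")
    print(f"packets/s at {PMTU_BYTES} bytes: {packets_per_second(TINC_MB_PER_S, PMTU_BYTES):.0f}")
```

Under these assumptions the raw AES benchmark is well over ten times faster than what the tunnel delivers, which is consistent with the observation in the thread that Cipher = none gives the same 10 MB/s. That suggests, but does not prove, that the per-packet work of a single busy thread matters more here than the choice of cipher.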
>> _______________________________________________
>> tinc mailing list
>> tinc at tinc-vpn.org
>> https://www.tinc-vpn.org/cgi-bin/mailman/listinfo/tinc