Coming fresh off of this forum post: https://forums.plex.tv/t/docker-container-keeps-loosing-connection-access-db-constantly-busy/590693/66
I am wanting to see what potential issues there are going from 6.7u3 to 7.0, with VMDKs on Ubuntu 20.04. When running disk latency and throughput commands against an NVMe Samsung 970 Pro, there seems to be a drastic bandwidth difference between newly created VMDKs and existing ones (all created within ESXi 7.0).
All of the above was done from within the guest VM.
For instance:
```
@apollo:~$ sudo fdisk /dev/sdd

Command (m for help): n
Select (default p):
Using default response p.
Partition number (1-4, default 1):
First sector (2048-134217727, default 2048):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-134217727, default 134217727):

Created a new partition 1 of type 'Linux' and of size 64 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

andrew@apollo:~$ sudo mkfs.ext4 /dev/sdd
mke2fs 1.45.5 (07-Jan-2020)
Found a dos partition table in /dev/sdd
Proceed anyway? (y,N) y
Creating filesystem with 16777216 4k blocks and 4194304 inodes
Filesystem UUID: 500830b1-a52c-4aba-ad7f-c4b012d7278d
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424

Allocating group tables: done
Writing inode tables: done
Creating journal (131072 blocks): done
Writing superblocks and filesystem accounting information: done

andrew@apollo:~$ sudo dd if=/dev/sdd of=/dev/null bs=1M
65536+0 records in
65536+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 19.0623 s, 3.6 GB/s
andrew@apollo:~$ sudo dd if=/dev/sda of=/dev/null bs=1M
24576+0 records in
24576+0 records out
25769803776 bytes (26 GB, 24 GiB) copied, 29.965 s, 860 MB/s
```
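One caveat I realize with the dd reads above is that they go through the guest page cache and readahead, so the gap between the fresh disk and the in-use one may not be purely the storage path. A cross-check I could run (a sketch, assuming fio is installed in the guest and the device name matches my setup):

```
# Sequential read with direct I/O, bypassing the guest page cache.
# /dev/sdb is a placeholder for whichever VMDK-backed disk is being tested.
sudo fio --name=seqread --filename=/dev/sdb --rw=read --bs=1M \
         --direct=1 --ioengine=libaio --iodepth=32 \
         --runtime=60 --time_based --readonly
```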
We can see there is a huge difference reading from the in-use drive compared to an empty disk, even when running it against a mostly empty disk (<14 GB of data). In addition, it took 3 minutes to write a 12 GB test file from `/dev/urandom` to /mnt/plex (sdb1):
```
@apollo:/mnt/plex$ sudo dd if=/dev/urandom of=/mnt/plex/testfile.fake bs=1M count=105MB
[sudo] password for :
11117+0 records in
11116+0 records out
11655970816 bytes (12 GB, 11 GiB) copied, 183.181 s, 63.6 MB/s
```
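Part of that 63.6 MB/s may be /dev/urandom itself, which is CPU-bound and can cap a single dd well below NVMe speeds. A way to take the RNG out of the path (a sketch; the paths and sizes are placeholders for my layout):

```
# Write zeros with direct I/O so neither the RNG nor the page cache is in the way.
sudo dd if=/dev/zero of=/mnt/plex/testfile.zero bs=1M count=12288 oflag=direct status=progress

# Or pre-generate the random data once, then time only the write:
head -c 4G /dev/urandom > /tmp/random.bin        # assumes /tmp has the space
sudo dd if=/tmp/random.bin of=/mnt/plex/testfile.fake bs=1M oflag=direct status=progress
```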
That seems like a ridiculously low write speed. And here is hdparm -Tt run against two different VMDKs located on the same NVMe SSD:
```
@apollo:~$ sudo hdparm -Tt /dev/sdb

/dev/sdb:
 Timing cached reads:   19262 MB in  2.00 seconds = 9651.45 MB/sec
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 Timing buffered disk reads: 2706 MB in  3.00 seconds = 901.96 MB/sec
andrew@apollo:~$ sudo hdparm -Tt /dev/sdb

/dev/sdb:
 Timing cached reads:   20236 MB in  2.00 seconds = 10141.30 MB/sec
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 Timing buffered disk reads: 2738 MB in  3.00 seconds = 912.12 MB/sec
andrew@apollo:~$ sudo hdparm -Tt /dev/sdc1

/dev/sdc1:
 Timing cached reads:   19366 MB in  2.00 seconds = 9703.55 MB/sec
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 Timing buffered disk reads: 9146 MB in  3.00 seconds = 3047.93 MB/sec
andrew@apollo:~$ sudo hdparm -Tt /dev/sda1
```
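Another thing I want to rule out is which virtual controller each disk hangs off (the SG_IO sense-data noise from hdparm makes me wonder about that). A sketch of what I had in mind; nothing here is specific to my VM beyond the device names:

```
# Show which SCSI address (host:channel:target:lun) each disk sits on,
# which maps back to the virtual controllers in the VM's settings.
lsblk -o NAME,HCTL,SIZE,TYPE,MOUNTPOINT

# Direct-I/O read for an apples-to-apples comparison with the dd runs above.
sudo dd if=/dev/sdb of=/dev/null bs=1M count=4096 iflag=direct
```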
All of this is on Ubuntu 20.04, so I am wondering if there is some issue with ESXi 7.0 around throughput, VMDK creation, TPS, etc. Is there a better way to debug this at the ESXi level, and if so, what should I be doing?
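The only ESXi-side check I have come up with so far is esxtop's disk views; I am not sure it is the right approach, so treat this as a sketch:

```
# From the ESXi shell / SSH session:
esxtop
#   'd' = disk adapter view, 'u' = disk device view, 'v' = per-VM disk view,
#   'f' = add/remove fields.
#   DAVG/cmd = device latency, KAVG/cmd = time spent in the VMkernel,
#   GAVG/cmd = total latency the guest sees. High KAVG with low DAVG would
#   point at the hypervisor storage stack rather than the 970 Pro itself.

# And the NVMe device as ESXi sees it:
esxcli storage core device list
esxcli nvme device list
```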
To confirm, this has all been done on:
t10.NVMe____Samsung_SSD_970_PRO_1TB_________________EA43B39156382500, partition 1
Version: "AMD Ryzen 7 1700 Eight-Core Processor"
64 GB DDR4 ECC memory
X470 Motherboard
The guest VM's settings are:
I was going to try to downgrade back to ESXi 6.7u3, but it seems like the upgrade to 7.0 wipes out the bootbank's 6.7 config:
```
[root@esxi2:~] tail -2 /*bootbank/boot.cfg
==> /altbootbank/boot.cfg <==
build=7.0.0-1.0.15843807
updated=5

==> /bootbank/boot.cfg <==
build=7.0.0-1.0.15843807
updated=4
```
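Since both bootbanks report 7.0.0, I assume the boot-time Shift+R "revert to previous image" path is off the table. For reference, this is how I was confirming what is actually installed (nothing clever here):

```
# Running build, and the build recorded in each bootbank:
vmware -v
grep build /bootbank/boot.cfg /altbootbank/boot.cfg
```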
So my question at this point is: should I load ESXi 6.7u3 back onto a USB stick and boot from there for the time being? I was wanting to test k8s/Kubernetes on vSphere/ESXi, but I haven't been able to figure out how to do that on a single ESXi host. Either way, I would rather not endure random slowdown, throughput, or TPS issues on VMDKs like the ones above.
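One more theory I want to check before downgrading: whether the new-vs-existing VMDK gap comes down to provisioning, i.e. how much of each disk has actually been allocated on VMFS. A rough sketch from the ESXi shell; the datastore path and VMDK names are placeholders for mine:

```
cd /vmfs/volumes/datastore1/apollo        # placeholder datastore/VM folder
ls -lh apollo_1-flat.vmdk                 # provisioned size of the virtual disk
du -h  apollo_1-flat.vmdk                 # space actually allocated on the datastore
```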