  1. #1
    As some of you have noticed, the server still feels sluggish from time to time. I think I may have found the culprit: Dynmap! We have Dynmap running on our TFC servers, 4 in total. Those generate a ton of tile updates because of TFC and its huge amount of block updates.

    Take a look at this:
    [Attachment: Dynmap Disk IO.png]

    What you see here are statistics for our WD Purple hard drive, which is used solely to store Dynmap tiles. It does nothing else. There's no raid, just a single drive. As you can see, the system is sending up to 1600 IOPS (input/output operations per second) to that drive. As a rule of thumb, a regular hard drive can only do about 100 IOPS.
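    For anyone who wants to see these numbers for themselves: below is a minimal Python sketch that samples the kernel's disk counters twice and works out read/write IOPS. It assumes a Linux system with /proc/diskstats, and the device name "sdb" is just a placeholder. iostat -x reports the same figures; this only makes it explicit where they come from.
    Code:
    # Rough IOPS check for one block device by sampling /proc/diskstats twice.
    # "sdb" is a placeholder; substitute the device that holds the Dynmap tiles.
    import time

    DEVICE = "sdb"
    INTERVAL = 5  # seconds between samples

    def completed_ops(device):
        # Fields 4 and 8 (1-based) are reads and writes completed since boot.
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                if fields[2] == device:
                    return int(fields[3]), int(fields[7])
        raise ValueError(f"device {device!r} not found in /proc/diskstats")

    r1, w1 = completed_ops(DEVICE)
    time.sleep(INTERVAL)
    r2, w2 = completed_ops(DEVICE)

    print(f"read IOPS:  {(r2 - r1) / INTERVAL:.1f}")
    print(f"write IOPS: {(w2 - w1) / INTERVAL:.1f}")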

    The reason this is slowing down the rest of the server is that, while it is a single drive, it is connected to the same raid controller as the rest of the hard drives. When a drive can't keep up with the requested number of IOPS, requests get queued. The raid controller has a queue depth of 255. When that queue gets saturated, IOPS meant for the other raid arrays are held back and have to compete on a flooded I/O path.
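    Some back-of-the-envelope math, assuming the drive really tops out around 100 IOPS while Dynmap keeps pushing roughly 1600, shows how quickly that queue fills up and how long a request at the back of it has to wait:
    Code:
    # Rough queue math: a 255-deep controller queue with requests arriving
    # faster than one drive can service them. Numbers are the ones from the
    # graph above plus the 100 IOPS rule of thumb.
    arrival_rate = 1600   # IOPS sent to the WD Purple drive
    service_rate = 100    # IOPS a single spinning disk can sustain
    queue_depth  = 255    # raid controller queue depth

    fill_time = queue_depth / (arrival_rate - service_rate)
    worst_case_wait = queue_depth / service_rate

    print(f"queue fills up in ~{fill_time:.2f} s")                # ~0.17 s
    print(f"request at the back waits ~{worst_case_wait:.2f} s")  # ~2.55 s
    Once that queue is full, requests for the other arrays end up stuck behind a backlog measured in seconds, which is exactly the kind of latency you feel as lag.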

    What I'm going to do is move the WD Purple drive to the onboard SATA controller. Since it doesn't use raid anyway, Dynmap can saturate that SATA controller all it wants. If I'm right about this, we should see lower latency on all the other raid volumes, meaning everything should run smoother.

    I'm probably going to do this sometime tonight. So if the server's down, that's why.

  2. #2
    As it turns out, the onboard SATA controller isn't supported by VMware ESXi. So instead I did the next best thing: I limited the maximum number of IOPS the virtual machine can send to the WD Purple drive. This resulted in the graphs below:
    [Attachment: Dynmap Disk IO after rate limiting.png]

    So it respects the hard limit of 100 IOPS. And as predicted, the latency for all the other raid arrays has gone down significantly, making everything feel snappy again.

    Now because the drive is being limited to 100 IOPS, which is still a crazy amount, Dynmap for the TFC servers may feel a bit sluggish. Sometimes it takes a while before it loads the tiles. Just give it a moment and it'll eventually load everything.

    Next up on our agenda is figuring out why exactly Dynmap is generating such a crazy amount of IOPS. It's only moving a few KB/s, so I'm not sure what's going on there. Sure, TFC does a huge amount of block updates, but even this is far beyond what I'd expect.
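    A quick sanity check on those numbers shows why it's so strange. Taking "a few KB/s" as, say, 8 KB/s (an assumed figure, just to illustrate the mismatch):
    Code:
    # Average bytes per operation if the drive moves ~8 KB/s at ~1600 IOPS.
    # The 8 KB/s is assumed; the 1600 IOPS comes from the earlier graph.
    throughput = 8 * 1024   # bytes per second (assumed)
    iops = 1600             # operations per second

    print(f"average I/O size: {throughput / iops:.1f} bytes")  # ~5 bytes
    No filesystem writes 5-byte blocks, so most of those operations are probably seeks, cache flushes or metadata updates rather than actual tile data.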

  3. #3
    As some of you may know, we had a bit of a snafu the other week. We run several virtual machines on our server to contain the damage should one of those VMs suddenly stop working, get hacked, die, or destroy its filesystem... Which is exactly what happened!

    We had two 3TB SATA drives in raid1 which were used as storage for the Minecraft VM and the email VM. Back then, the web server was also placed on the Minecraft VM. Then the ext4 file system of the Minecraft server died when I tried cleaning up some old ARK server instances (which have been running in their own VM for a really long time now). A disk check recovered most of the files and directories, but put them all in lost+found, which means every directory and file got placed out of context. Instead of having paths like:
    Code:
    /dir_A/sub_dir_1
    /dir_A/sub_dir_2
    /dir_A/sub_dir_3
    /dir_B/sub_dir_1
    /dir_B/sub_dir_2
    /dir_B/sub_dir_3
    etc
    We got this:
    Code:
    /random_dir/files
    /random_sub_dir/files
    /random_sub_dir/files
    etc
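    (Side note: if we ever have to pick through lost+found again, a small script that looks for recognisable marker files makes it a lot easier to figure out which recovered directory is which. Purely a hypothetical sketch; the mount point and the level.dat marker for Minecraft worlds are just examples.)
    Code:
    # Walk lost+found and flag anything that looks like a Minecraft world,
    # i.e. a directory containing level.dat. Path and marker are examples.
    from pathlib import Path

    LOST_FOUND = Path("/mnt/recovered/lost+found")  # example mount point
    MARKER = "level.dat"

    for marker in sorted(LOST_FOUND.rglob(MARKER)):
        world_dir = marker.parent
        size_mb = sum(p.stat().st_size for p in world_dir.rglob("*") if p.is_file()) / 1e6
        print(f"{world_dir}  (~{size_mb:.0f} MB)")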
    We do have backups of the email and Minecraft servers, so it's not too bad. But I didn't want to take any chances, so I decided to replace the two 3TB SATA drives. I've bought two 6TB SAS drives:
    Seagate Enterprise capacity wdbmma0060hnc-ersn 3.5 HDD 7200rpm SAS 12 GB/S 256 MB cache 8,9 cm 3,5 inch 24 x 7 512 Native BLK
    [Attachment: Seagate Enterprise drive.jpg]
    2x EUR 240,93

    While waiting for my order to arrive I set up a new VM that acts purely as a web server, and restored the email server. I decided to go with Ubuntu 18.04, as we ran Ubuntu 16.04 before and that was fine. Unfortunately Ubuntu 18.04 seemed to hang every few days. No crashes, it just froze with 1 CPU core at 100% load. So I decided to upgrade VMware as well, which is a bit of a hassle.

    For that to work I had to create a custom ISO of the VMware ESXi installer with the drivers for our raid controller on it. Having done that, I pulled the USB stick with the old VMware ESXi installation on it, created a bootable installer stick, and put a different stick in the server to install to. Installation failed a few times. So... I swapped the installer stick and the destination stick. Installation still failed. I dug around and found another USB stick to install ESXi on: a Sandisk 32GB stick, one of those really tiny ones that are barely larger than the USB plug itself. Installation succeeded! Then it was a matter of creating the virtual switches, importing all the remaining VMs and setting them to auto-start, and we've been running happily ever since.

    The new 6TB SAS drives arrived and I put them in the server. And nothing. The drives weren't detected by the raid controller at all. Either the controller is too old or the drives lack proper backward compatibility; whatever it is, it doesn't work.

    So I've also ordered a new SAS raid controller:
    Fujitsu SAS Controller psas cp400i 12 GB/S 8-port, based on LSI sas3008
    [Attachment: Fujitsu SAS Controller psas cp400i.jpg]
    EUR 216,72

    I also added some cables and decided to replace the case fans, 5 in total, as those have been running 24/7 for the past 7 years or so.

    The new raid controller should arrive this week. Assuming it works and we can get a working raid1 array with the new 6TB SAS drives, I can then create a new VM for Minecraft, restore the backups and get that part up and running again.

    So yea, snafu

  4. #4
    Quote Originally Posted by InsaneJ View Post
    So yea, snafu
    EUR 698.58 + unlisted amounts in cables and other parts for maintenance. More often than not J is backing the majority, if not all, of the costs. We appreciate all of our donors that help.

  5. #5
    Oh my, I hope everything will go as planned.

  6. #6
    Quote Originally Posted by Vikusha View Post
    Oh my, I hope everything will go as planned.
    It didn't.

    I got the new raid controller today and installed it in the server. I replaced the fans while I was at it. Turned on the server; the raid controller was detected, but it didn't detect the two SAS drives.

    Checking further, it doesn't seem the drives are spinning up at all. So far I've tried two different sets of cables and a converter block. None of them work. So either both drives are DOA, there's a problem with the power supply (not likely, since the server is currently running 11 other hard drives, both SAS and SATA), or these Seagate drives need something else entirely.

    First thing I tried was a simple adapter block:
    5Gbps SFF 8482 SAS to SATA 180 Degree Angle Adapter Converter Straight Head

    Then I tried a dedicated cable, thinking perhaps the drives weren't detected because I mixed SAS and SATA on one SFF-8087 connector (which splits into 4 SATA connectors). This is the cable:
    StarTech.com sas808782p50 internal Serial-Attached SCSI Mini SAS cable – SFF-8087 to 4x SFF-8482, 50 cm

    When that didn't work either, I tried another raid controller with its own cable. This one:
    CableDeconn SFF-8643 internal Mini SAS HD to (4) 29-pin SFF-8482 connectors with SAS 15-pin power port, 12 Gb/s cable, 1 m

    Perhaps it's the SATA-power-to-15-pin-SAS-power conversion that's causing the problems. Not sure why it would, since it's pretty straightforward, and remember we're already running four 300GB SAS drives without problems. But since the new drives aren't spinning up, that's the direction I'm leaning in right now.

    If anyone has any bright ideas, I'm all ears :B

  7. #7

  8. #8
    Quote Originally Posted by Jiro_89 View Post
    EUR 698.58 + unlisted amounts in cables and other parts for maintenance. More often than not J is backing the majority, if not all, of the costs. We appreciate all of our donors that help.
    Well, it looks like Jiro is planning on doing something similar. Not sure if I'm speaking out of turn here. Jiro, if you don't like me talking about it just yet, just remove my post.

    Anyway. The ideas that are floating around are something like this:
    Jiro is going to buy the current server's motherboard + CPU + CPU cooler + 64GB RAM + two 1.5TB hard drives from me. With those funds I'm going to upgrade our current server to an AMD Threadripper system. Then the parts are going to move around the world and will be placed into a machine that Jiro is going to host in the US. This effectively means we'll be moving the ARK servers to the US where most of our players are from (I think). While we're not getting more RAM for our servers overall right away, this does allow room for growth.

    Another plan floating around is that Sverf is going to be buying a new motherboard + CPU for his server that currently hosts the vanilla Minecraft servers. I'll be donating 32GB of RAM, a 24-port raid controller + cables and two 3TB drives for that build.

    The Jiro-plan is going to cost around 750 euro. Sverf will be spending around 350 euro. So if you're feeling generous, you know where the donate button is.
