Server hardware stuff



InsaneJ
26th March 2017, 15:14
In this thread I'll try to keep track of what goes on with the server, hardware-wise.

The last publicly documented server change was this: [completed] New server plans (These are no longer 'new' plans) (https://happydiggers.net/showthread.php?1856-completed-New-server-plans-(These-are-no-longer-new-plans))
We upgraded the server to a 6-core/12-thread Intel Core i7 5820K (http://ark.intel.com/nl/products/82932/Intel-Core-i7-5820K-Processor-15M-Cache-up-to-3_60-GHz) CPU with 64GB of RAM.

Unfortunately that setup gave us trouble running VMWare ESXi, which resulted in an unstable server. It took a few weeks to track down the exact cause. The problem was with the CPU. If you ever do any PC building: it's almost never the CPU that's causing stability issues, unless you're overclocking. But that wasn't the case here.

After that I decided to upgrade to a 14-core/28-thread Intel Xeon E5 2680 v4 (http://ark.intel.com/products/91754/Intel-Xeon-Processor-E5-2680-v4-35M-Cache-2_40-GHz) CPU with 96GB of RAM. This is what we are currently running all our servers on. That took care of the stability issues.

Then the next issue was with server performance. Or to be more precise: disk performance. The servers are running off two 7200 rpm SATA drives. And even though they are connected to an Areca 1680i raid controller with 4GB of cache and a dual-core 1.2GHz PowerPC CPU, it's not enough to run all the additional servers we're now running. It used to be:

website
email
Minecraft, 10 or so instances.

And now we added:

ARK Survival Evolved, 3 modded instances.

Disk I/O was lagging behind and that caused some noticeable performance issues.

So I purchased an Icy dock Tough Armor (https://www.amazon.com/DOCK-ToughArmor-MB994SP-4S-6Gbps-Mobile/dp/B0040Z924Q) 4 x 2.5" mobile rack for 1 x 5.25" device bay. In this dock, I have placed three second hand 300GB 10K rpm SAS drives. These disks are meant to offset the disk I/O that's been hammering the OS drives. I also swapped out the four 80GB Intel Postville SSDs and replaced those with two new Samsung 850 Pro 256GB SSDs (https://www.amazon.com/Samsung-850-PRO-2-5-Inch-MZ-7KE256BW/dp/B00LMXBOP4/ref=sr_1_1?s=electronics&ie=UTF8&qid=1490534283&sr=1-1&keywords=samsung+850+pro+256gb).

The additional drives worked well for a few weeks. Unfortunately this morning at around 5:02AM one of those three drives failed. This is not a big deal since they were running in raid-5. However, it does mean I now have to move the virtual machines that were running on those drives back to the other disks they were on before. Which means we may experience some slowdowns in the time to come.

The 10K rpm SAS drives were second hand with no warranty, so the faulty drive will have to be replaced by buying another. We just had a beautiful baby girl (https://happydiggers.net/showthread.php?2431-Diapers-are-cheaper-in-bulk) and with all the stuff we need for that I'm not allowed (haha :)) to spend more money on my hobby. So if anyone wants to help out by donating (see the front page), that would be much appreciated. Those SAS drives cost about $55 each:
HP 300GB 6G SAS 10K 2.5 inch (https://www.amazon.com/HP-300GB-2-5-inch-Enterprise-507127-B21/dp/B0025B0EUS?ref_=nav_signin&)

At any rate the servers will continue to run. The ARK servers are running on two Samsung 850 PRO SSDs. It's just everything else that will get a performance hit now that they have to share the slower storage.

Rainnmannx
26th March 2017, 16:45
Would this lower TPS on the freebietfc server? It was around 18 this morning with 2 on, and around 14 now with 4 on.
If it is I will let people know if they ask about it. And this is in no way a push to get it fixed; family is first :).

InsaneJ
26th March 2017, 17:03
I'm not sure if TPS will drop when loading chunks takes a bit longer. It might.

As for fixing the issue, that won't take much time. If/when I have the funds it's just a matter of ordering a drive, then pulling the defective drive from the server and replacing it with the new one. The raid controller will then start rebuilding the array automatically. After that is done I'll move the virtual machines back to the SAS drives. All in all it won't take more than half an hour of button pushing. The rebuilding and virtual machine moving will take longer, but that's just a progress bar filling up :)

Rainnmannx
27th March 2017, 00:22
Is there anything I can do to check what may be causing the TPS drop?
I used /lag and the entities were around 4000, below the 10k number I have seen in the message.
Also at one point mem had 646 or so remaining.

/lag now shows 2065 chunks, 990 entities, 126,537 tiles. It looks like it reset about 3 hours ago and is back to 19.87 TPS, though the one on the TAB screen shows a different number of about 4. Bukkit TPS?

InsaneJ
27th March 2017, 11:04
/lag shows TPS from the Bukkit side of the server, same as the value shown when pressing TAB. Use /forge tps instead to get a more accurate reading on how the server is doing. Bukkit TPS tends to always be slightly lower than the Forge TPS and less than 20 (19,xx). To me it looks like one sits on top of the other, since Forge and Forge mods seem to take precedence over Bukkit and Bukkit plugins.

The TAB TPS being low was due to a glitch in BungeeCord and the plugin that takes care of that information. I've restarted BungeeCord and now the 'Bukkit TPS' in the TAB screen is similar to that when you type /lag or /tps

3197

3198

Bottom pic shows output of:
/lag
/tps
/forge tps

InsaneJ
27th March 2017, 11:49
I received three donations last night. Thanks guys!
With this I'm going to purchase two new drives. Unfortunately the drives I linked above won't ship to The Netherlands. Buying them here (European Amazon) they are 89.98 euro (https://www.amazon.de/Ersatzteil-300Gb-Plug-5Zoll-507127-B21/dp/B008M10K6I/ref=sr_1_1?ie=UTF8&qid=1490607563&sr=8-1&keywords=hp+300gb+sas+10k+2.5). Which means that to avoid an angry wife we could really use another donation or two :)

The two drives will replace the faulty one and expand the array to give us a net storage of 900GB with more I/O. (Raid-5 capacity = n-1, meaning: 4x300 - 300) The drives should arrive in a few days.
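For anyone who wants to check the raid-5 math, here's a rough Python sketch of the n-1 rule using the numbers above (the helper function is just made up for illustration):

```python
# raid-5 stores one drive's worth of parity, so usable space is (n - 1) drives.
def raid5_usable_gb(drive_count, drive_size_gb):
    assert drive_count >= 3, "raid-5 needs at least 3 drives"
    return (drive_count - 1) * drive_size_gb

print(raid5_usable_gb(3, 300))  # old array: 600 GB usable
print(raid5_usable_gb(4, 300))  # after adding a 4th drive: 900 GB usable
```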

InsaneJ
30th March 2017, 15:39
The new hard drives were delivered today. I put them in the server and, although this is something for which the server can remain on, it went down anyway. Reason being my 2-year-old who saw a shiny power button and just had to push it. I still haven't figured out how I can configure ESXi to ignore the power button. So... apologies for the unscheduled down time :pig:

The new drives have been put in a raid-5 array and it's currently initializing. We're growing from 600 to 900GB (4x300 - 300) and when that's done I'll start moving virtual machines back to this array. After that server performance should be back to normal.

Sverf
2nd April 2017, 13:19
Maybe you can borrow the dance dance authentication from stackoverflow :p:p:p

InsaneJ
14th June 2017, 20:39
As some of you have noticed the server still feels sluggish from time to time. I think I may have found the culprit. It's Dynmap! We have Dynmap running on our TFC servers, 4 in total. Those generate a ton of updates because of TFC and its huge amount of block updates.

Take a look at this:
3324

What you see here are statistics for our WD Purple hard drive which is solely used to store Dynmap tiles. It does nothing else. There's no raid, just a single drive. As you can see the system is sending up to 1600 IOPS (input/output operations per second) to that drive. As a rule of thumb a regular hard drive can only do about 100 IOPS.

The reason this is slowing down the rest of the server is that, while it is a single drive, it is connected to the same raid controller as the rest of the hard drives. When a drive can't keep up with the requested amount of IOPS, a bunch of these get queued. The raid controller has a queue depth of 255. When that queue gets saturated, IOPS meant for other raid arrays are put back in line and need to compete on a flooded I/O path.
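As a rough back-of-the-envelope sketch of how quickly that queue fills up (using the numbers above; a real controller is more complicated than this, so treat it as intuition only):

```python
# Rough sketch: one drive being offered far more IOPS than it can service.
offered_iops = 1600   # what the system sends to the WD Purple drive
drive_iops   = 100    # rule-of-thumb for a single spinning disk
queue_depth  = 255    # queue depth of the raid controller

backlog_per_second = offered_iops - drive_iops
seconds_to_saturate = queue_depth / backlog_per_second

print(f"queue grows by ~{backlog_per_second} outstanding requests per second")
print(f"controller queue is full after roughly {seconds_to_saturate:.2f} s")
# Once the queue is full, I/O for the other arrays has to wait in the same line.
```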

What I'm going to do is move the WD Purple drive to the onboard SATA controller. Since it doesn't use raid anyway, Dynmap can saturate that SATA controller all it wants to. If I'm right about this we should see lower latency for all other raid volumes meaning everything should run smoother.

I'm probably going to do this sometime tonight. So if the server's down, that's why :)

InsaneJ
14th June 2017, 23:21
As it turns out, the onboard SATA controller isn't supported by VMWare ESXi. So instead I did the next best thing: I limited the maximum number of IOPS the virtual machine can send to the WD Purple drive. This resulted in the graphs below:
3325

So it respects the hard limit of 100 IOPS. And as predicted the latency for all the other raid arrays has gone down significantly, making everything feel snappy again.

Now because the drive is being limited to 100 IOPS, which is still a crazy amount, Dynmap for the TFC servers may feel a bit sluggish. Sometimes it takes a while before it loads the tiles. Just give it a moment and it'll eventually load everything.

Next up on our agenda is trying to figure out why exactly Dynmap is generating such a crazy amount of IOPS. It's only doing a few KB/s. So I'm not sure what's going on there. Sure TFC does a huge amount of updates, but even this is far beyond what I'd expect.
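To illustrate why 'a few KB/s' at 1600 IOPS is so odd: throughput is just IOPS times the average request size. Here's a quick sketch with some assumed request sizes (only the 1600 IOPS figure comes from the graph above):

```python
# Throughput = IOPS x average request size.
def throughput_kb_per_s(iops, request_size_bytes):
    return iops * request_size_bytes / 1024

for size in (4096, 512, 64):  # assumed request sizes in bytes
    kb_s = throughput_kb_per_s(1600, size)
    print(f"{size:>4} B requests at 1600 IOPS -> ~{kb_s:,.0f} KB/s")
# Even 512 B requests would already be ~800 KB/s, so "a few KB/s" suggests the
# operations are mostly tiny metadata updates or syncs rather than tile data.
```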

InsaneJ
11th September 2018, 14:20
As some of you may know we had a bit of a snafu the other week. We run several virtual machines on our server to mitigate problems should one of those VMs suddenly decide to stop working, or get hacked, or die, or destroy its filesystem... Which is exactly what happened!

We had two 3TB SATA drives in raid1 which was used as storage for the Minecraft VM and the email VM. Back then, the web server was also placed on the Minecraft VM. Then the ext4 file system of the Minecraft server died when I tried cleaning up some old ARK server instances (which have been running in their own VM for a really long time now). A disk check recovered most of the files and directories, but put them all in lost+found, which means each directory and file got placed out of context. Instead of having a path like:

/dir_A/sub_dir_1
/dir_A/sub_dir_2
/dir_A/sub_dir_3
/dir_B/sub_dir_1
/dir_B/sub_dir_2
/dir_B/sub_dir_3
etc

We got this:

/random_dir/files
/random_sub_dir/files
/random_sub_dir/files
etc

We do have backups of the email and Minecraft servers so it's not too bad. But I didn't want to take any chances and decided to replace the two 3TB SATA drives. I've bought two 6TB SAS drives:
Seagate Enterprise capacity wdbmma0060hnc-ersn 3.5 HDD 7200rpm SAS 12 GB/S 256 MB cache 8,9 cm 3,5 inch 24 x 7 512 Native BLK (https://www.amazon.de/gp/product/B01N48L92T/ref=oh_aui_detailpage_o02_s00?ie=UTF8&psc=1)
3717
2x EUR 240,93

While waiting for my order to arrive I set up a new VM dedicated to being just a web server and restored the email server. I decided to go with Ubuntu 18.04, as we ran Ubuntu 16.04 before and that was fine. Unfortunately Ubuntu 18.04 seemed to hang every few days. No crashes, it just froze with 1 CPU core at 100% load. So I decided to upgrade VMWare as well, which is a bit of a hassle.

For this to work I had to create a custom ISO of the VMWare ESXi installer that had drivers on it for our raid controller. Having done that I pulled the USB stick with the old VMWare ESXi installation on it, created a boot USB stick and put a different stick in the server to install to. Installation failed a few times. So... I swapped the installer USB stick and the destination stick. Installation still failed. I dug around and found another USB stick to install ESXi on: a SanDisk 32GB stick, one of those really tiny ones that are barely larger than the USB plug itself. Installation succeeded! Then I created virtual switches, imported all the remaining VMs, set them to auto-start, and we've been running happily ever since.

The new 6TB SAS drives arrived, I put them in the server. And nothing. The drives weren't detected by the raid controller, at all. It seems the controller is too old, or the drives lack proper backward compatibility, whatever. It doesn't work.

So I've also ordered a new SAS raid controller:
Fujitsu SAS Controller psas cp400i 12 GB/S 8 poort based on LSI sas3008 (https://www.amazon.de/gp/product/B00UQSKF7O/ref=oh_aui_detailpage_o00_s01?ie=UTF8&psc=1)
3718
EUR 216,72

Then I added some cables (https://www.amazon.de/gp/product/B07B9SBSVW/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1) and decided to also replace the case fans (https://www.amazon.de/gp/product/B01J76IYL4/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1), 5 in total, as those have been running 24/7 for the past 7 years or so.

The new raid controller should arrive this week. Assuming it works and we can get a working raid1 array with the new 6TB SAS drives, I can then create a new VM for Minecraft, restore the back-ups and get that part up and running again.

So yea, snafu (https://www.urbandictionary.com/define.php?term=SNAFU) :)

Jiro_89
12th September 2018, 01:02
So yea, snafu (https://www.urbandictionary.com/define.php?term=SNAFU) :)

EUR 698.58 + unlisted amounts in cables and other parts for maintenance. More often than not J is backing the majority, if not all of the costs. We appreciate all of our donors that help :)

Vikusha
13th September 2018, 01:38
Oh my, I hope everything will go as planned.

InsaneJ
14th September 2018, 15:15
Oh my, I hope everything will go as planned.
It didn't.

I got the new raid controller today and installed it in the server. Replaced the fans while I was at it. Turned on the server, raid controller was detected, but it didn't detect the two SAS drives.

Checking further, it doesn't seem like the drives are spinning up. So far I've used two different sets of cables and a converter block. None of those seem to work. So either both drives are DOA, there's a problem with the power supply (not likely since the server is currently running 11 other hard drives, both SAS and SATA), or these Seagate drives need something else entirely.

First thing I tried was a simple adapter block:
5Gbps SFF 8482 SAS to SATA 180 Degree Angle Adapter Converter Straight Head
3723
3724

Then I tried a dedicated cable, thinking perhaps the drives weren't detected because I mixed SAS and SATA on 1 SFF8087 connector (which splits to 4 SATA connectors). This is the cable:
StarTech. com sas808782p50 Intern Serial-attached SCSI Mini SAS kabels – SF-8087 op 4 x SFF-8482 50 cm
3721

When that didn't work either I tried another raid controller with its own cable. This one:
CableDeconn SFF-8643 interne Mini SAS HD auf (4) 29pin SFF-8482 Stecker mit SAS 15pol Power Port 12 GB/s Kabel 1M
3722

Perhaps it's the SATA power to 15-pin SAS power conversion that's causing the problems. Not sure why it would, since it's pretty straightforward. Remember we're already running four 300GB SAS drives without problems. But since the new drives aren't spinning up I'm leaning in that direction right now.

If anyone has any bright ideas, I'm all ears :B

Vikusha
15th September 2018, 05:50
Aww. :(

InsaneJ
15th September 2018, 17:28
Sverf linked me a forum thread where someone experienced something similar. In the end the solution was to use silver cables forged by elves in moonlight. So I've ordered another set of cables:
Adaptec ACK-I-HDmSAS-4SAS-SB-.8M
3725
We'll see if that makes any difference.

InsaneJ
20th September 2018, 15:18
The new cable works!

I plugged it in just now. The drives spun up and are recognized by the raid controller. I'll try doing the raid setup and VM stuff tonight so we can get the TFC servers up and running.

Elbe97
20th September 2018, 16:46
So the silver cable forged by Elves in moonlight worked :) Awesome! Good news for TFC :)

InsaneJ
22nd October 2018, 09:21
EUR 698.58 + unlisted amounts in cables and other parts for maintenance. More often than not J is backing the majority, if not all of the costs. We appreciate all of our donors that help :)
Well, it looks like Jiro is planning on doing something similar. Not sure if I'm speaking out of turn here. Jiro, if you don't like me talking about it just yet, just remove my post ;)

Anyway. The ideas that are floating around are something like this:
Jiro is going to buy the current server's motherboard + CPU + CPU cooler + 64GB RAM + two 1.5TB hard drives from me. With those funds I'm going to upgrade our current server to an AMD Threadripper system. Then the parts are going to move around the world and will be placed into a machine that Jiro is going to host in the US. This effectively means we'll be moving the ARK servers to the US where most of our players are from (I think). While we're not getting more RAM for our servers overall right away, this does allow room for growth.

Another plan floating around is that Sverf is going to be buying a new motherboard + CPU for his server that currently hosts the vanilla Minecraft servers. I'll be donating 32GB of RAM, a 24-port raid controller + cables and two 3TB drives for that build.

The Jiro-plan is going to cost around 750 euro. Sverf will be spending around 350 euro. So if you're feeling generous you know where the donate button is :)

bram_dc
22nd October 2018, 18:30
Hi InsaneJ,
I have a small job as a web developer and realised I have an unused 860 EVO 256GB lying around, which I can sell to you for cheap. If I'm right you live in the Netherlands as well. I would also suggest running the website on a cheap hosting service to make some space on the SSDs for ARK and MC. Website hosting is really cheap atm. I can also host something for you for free if you're interested. So contact me if you want anything,
Greetings Bram.

InsaneJ
23rd October 2018, 17:02
Hi InsaneJ,
I have a small job as a web developer and realised I have an unused 860 EVO 256GB lying around, which I can sell to you for cheap. If I'm right you live in the Netherlands as well. I would also suggest running the website on a cheap hosting service to make some space on the SSDs for ARK and MC. Website hosting is really cheap atm. I can also host something for you for free if you're interested. So contact me if you want anything,
Greetings Bram.
Thanks for your offer, Bram :)

Right now we have no shortage of disk storage. Recently I've upgraded our server with two 6TB SAS drives. In addition to four 300GB 10K SAS drives, two 256GB Samsung 850 PRO SSDs, six 3TB WD Red drives and a 1TB NVMe SSD for caching, we have plenty of space to put everything.

Compared to hosting ARK and heavily modded Minecraft servers, the web server doesn't use up a whole lot of resources. We upgraded our server to 128GB of RAM a short while ago. Right now that is sufficient to run all the servers we want.

All those figures combined add up to quite a large sum of money. But compared to having to rent servers, it's cheap. Also it's a bit of a hobby :)

InsaneJ
22nd November 2018, 22:05
Some of you may know about our crazy plans to set up another HappyDiggers server in the US. That plan seems to be going forward. Jiro bought my current 14-core Xeon, motherboard and 64GB of RAM. This allowed me to buy the following parts to upgrade our current server with:



AMD Ryzen Threadripper 1920X 12-core / 24-thread CPU - € 335,00
ASRock X399 TAICHI motherboard - € 348,33
Noctua NH-U14S CPU cooler - € 79,95
Corsair RMx Series RM850x (2018) power supply - € 118,00
USB drive - € 15,99



I put together these components along with 32GB of RAM and an old Nvidia Quadro card I had lying around. Unfortunately the Quadro shorted and let the smoke out. Yikes...

After that I removed it from the motherboard and booted it up. Oddly enough it displayed the all-OK POST codes and the num lock LED on the keyboard seemed to turn on/off when pressing the num lock key. So it seems that apart from the Quadro the other hardware is fine.

I don't have any other graphics cards lying around to test with and I don't feel like taking one out of our HTPC. So I've bought an ATI-102-B17002(B) 256MB PCI-e x1 card for 2 euro. That's right. ATI. From way back before it was bought by AMD. Essentially all it has to do is display the BIOS and a VMWare terminal. If I could have found a 4MB non-3D card that fits in a PCI-e x1 slot I would have gotten that. But this card should do nicely :)

Once I get the 'new' graphics card I'll try installing VMWare and do some testing. If that all goes well I'll do the server upgrade after that. This involves removing the current motherboard and expansion cards from the server case and replacing it with the new parts. This shouldn't take too long. After that I'll have to do some work on VMWare to get all the VMs up and running again.

I won't send the parts to Jiro after that just yet. He asked me to wait until he can upgrade his Internet connection to allow for more upstream bandwidth. This means the current server will continue to run with 128GB of memory, but with a faster CPU.

Pics!

AMD Ryzen Threadripper 1920x
3769

Asrock X399 Taichi
3770

Noctua NH-U14S
3771

Corsair RMx Series RM850x
3772

ATI-102-B17002(B)
3773

Marius49
23rd November 2018, 17:19
I think I had at one point an ATI card with specs similar to your "new" one. It had the same fate as your old Quadro :p

InsaneJ
23rd November 2018, 20:38
I received the 'new' ATI card in the mail today. Installed it. Works!

I've installed VMWare ESXi on the USB drive and so far things are looking good. If all goes well I may upgrade the server somewhere this weekend. I'm not sure what day and time though. You'll know when everything is down. That's me upgrading the server ;)

InsaneJ
9th December 2018, 09:25
Yesterday Sverf and I upgraded Sverf's server. The plan was to replace the motherboard / CPU / RAM with new parts and to put in a fancy raid controller and two extra 3TB hard drives. The first part worked, the raid controller part not so much.

Sverf's server has gone from an Intel Xeon X5650 (https://ark.intel.com/products/47922/Intel-Xeon-Processor-X5650-12M-Cache-2-66-GHz-6-40-GT-s-Intel-QPI-) 6-core / 12-thread CPU to an AMD Ryzen 5 2600 (https://www.amd.com/en/products/cpu/amd-ryzen-5-2600) 6-core / 12-thread CPU. The memory was upgraded from 24GB DDR-3 to 32GB DDR-4. And a quad-port HP Intel server network card was installed.

The 24-port Areca 1280ML raid controller we wanted to install seemed to be broken so we had to skip that step. That also meant that we couldn't change the system from running Xen virtualization to VMWare. After we had given up on the raid controller the rest of the upgrade went pretty smoothly. Swapping the hardware was pretty easy and after that Sverf did some fancy Linux stuff which goes well beyond clicking next-next-finish in the Ubuntu graphical installer ;)

The following hardware was installed:


AMD Ryzen 5 2600 CPU - € 168,95
ASRock B450 Pro4 motherboard - € 86,90
32GB HyperX RAM - € 275
Zotac ZT-71304-20L - € 58,65
HP Intel quad-port server network card - € 70



Pics!

AMD Ryzen 5 2600
3777

ASRock B450 Pro4
3778

HyperX RAM
3779

Zotac ZT-71304-20L
3780

HP Intel Quad-port server network card
3781


Sverf's server currently hosts the HappyDiggers MC, Vanilla, Snapshot and FTB servers. Before the upgrade the FTB server, which currently runs the HappyDiggers AMP 4.x mod pack, was sitting on a mean tick time of about 55~60ms. This means the server was slightly lagging, and with the amount of mods in AMP that becomes very noticeable. Right now the FTB server has a tick time of about 20~30ms. This means performance has roughly doubled going from the Xeon to the Ryzen. Of course, to be fair, the Xeon was getting really old at this point as it was from the first generation Core series. Still, it's a nice upgrade :)
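For reference, here's the tick-time math as a small sketch: Minecraft targets 20 TPS, which is a 50ms budget per tick, so TPS only starts dropping once the mean tick time goes over 50ms (the tick times are roughly the before/after numbers above):

```python
# Minecraft aims for 20 ticks per second, i.e. a 50 ms budget per tick.
def effective_tps(mean_tick_ms):
    return min(20.0, 1000.0 / mean_tick_ms)

for tick_ms in (25, 30, 55, 60):  # roughly the before/after numbers above
    print(f"{tick_ms} ms/tick -> {effective_tps(tick_ms):.1f} TPS "
          f"({tick_ms / 50:.0%} of the 50 ms budget)")
```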

Jiro_89
9th December 2018, 17:30
3782

InsaneJ
14th December 2018, 19:39
This afternoon one of the four 300GB 10K rpm SAS drives failed. These drives run in raid-5 so a single drive failure does not mean any data is lost. I've moved the VMs running on this array to the new 6TB SAS drives and replaced the faulty drive. The array is now rebuilding.

Nothing fancy. Nothing to worry about. Just another drive failure. The replacement cost about 50 euro.

InsaneJ
19th March 2019, 22:59
Last night the server crashed. Turns out the Areca raid controller is slowly going bad. After restarting the server the controller wasn't recognized any longer. After another restart the card re-appeared.

As was announced on the front page I've removed 64GB of RAM from the server to be sent to Jiro along with parts for the server he's building. When I started the server after that, the Areca raid controller was gone from the system again. Another restart and it was back. Being able to reproduce that kind of behavior is never a good sign.

I've got back-ups of the complete VMs that run on the server, but it would still suck if the raid controller died completely before a replacement arrives. I've been looking for a second hand Areca controller. That's going to cost about 200 euro. For now I'm just going to hope that the controller will continue to function, as we're not looking to be spending extra money on hobbies right now. We're in the process of moving and we've got quite a lot of expenses to cover in that regard. If it lasts a couple more months I'll be happy to replace it in the summer.

Anyway. The important thing to note is that the server got a big performance bump by taking out half the RAM. The reason for this is that one 64GB kit was rated for 2133MHz while the other is a 3000MHz kit which was running at 2133MHz. The new AMD CPUs benefit greatly from faster RAM, which can't be said for Intel CPUs unfortunately. Right now the new TnFC server seems to be running about 25% faster than before. Or rather, it's using 25% less time per tick. Which is nice :)
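As a side note on the wording: '25% less time per tick' actually works out to a bit more than a 25% speedup, since throughput scales with the inverse of the tick time. A rough sketch with made-up example numbers:

```python
# "25% less time per tick" corresponds to ~33% more ticks' worth of work per second.
old_tick_ms = 40.0                      # example value, not a measurement
new_tick_ms = old_tick_ms * (1 - 0.25)  # 25% less time per tick
speedup = old_tick_ms / new_tick_ms - 1
print(f"{old_tick_ms:.0f} ms -> {new_tick_ms:.0f} ms per tick "
      f"(~{speedup:.0%} higher throughput)")
```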

InsaneJ
24th September 2019, 15:47
As some of you may have noticed the server reboots every few days now. The reason for this is the Areca 1680 controller, which is very much ready to be replaced. I checked the purchase receipt for it: I've had this controller since April of 2010 and it has been running 24/7 ever since. That makes it a pretty awesome piece of hardware in my book :)

To replace the 1680 I have ordered a second hand ARECA ARC-1880DI-IX-12 SAS 6G 12 Port RAID Controller for $230. Two years ago this card still sold for $790 new. According to Areca support the ARC-1880 should serve as a drop-in replacement for the ARC-1680 card. I'll make backups before upgrading, but if all goes well it shouldn't be much more work than replacing the card, booting VMware ESXi and (maybe) importing the raid volumes and then the VMs on those volumes. If all goes well...

InsaneJ
30th September 2019, 18:49
The new raid controller arrived today. Customs screwed me over and added a 53 euro import fee....

In the next few days I'll be moving all the VMs from the Areca controller to the LSI one. In case the newer Areca card doesn't work with the current drives and configured raid sets and volumes, I can simply re-create those and then migrate the VMs there instead of having to recover from back-ups.

This means things will probably slow down a bit as everything has to run off two SAS drives for a while. I'll also test the new card to see if it actually works. And when I'm satisfied it has a good chance of working I'll swap the old Areca for the newer one.

InsaneJ
25th April 2020, 10:08
It's been a while since the last update :)

Nothing much has changed hardware-wise, though we've been experiencing some problems with the new Vanilla server. The Java Virtual Machine (JVM) crashes every so often. For those that don't know: Minecraft runs on Java, which means its code is processed through the JVM, which translates Java code to machine code that your processor can understand. This way the developers of Minecraft (or in our case, Spigot) only have to write their code once and it can run on many platforms. Windows, Linux, FreeBSD, Mac. If it has Java installed on it, it can probably run the code.

So the JVM crashing is a very bad thing. In the past few months the file system of the Dynmap hard drive has gotten corrupted. I've even re-rendered the Dynmaps for TFC and TnFC, which took weeks. All of our Dynmap tiles (images) reside on a single WD Purple drive. The drive itself reports no errors. What I'm going to do now is move the virtual hard drive in which all Dynmap files reside from the WD Purple drive to one of the raid arrays. Then I'm going to replace the WD Purple drive with an old hard drive I have lying around. Then move the virtual hard drive to that 'new' physical drive and see if that changes anything.

Technically I could do all these changes while the server is running. But to be on the safe side I prefer to bring the server down when swapping parts and handling cables. This means there will be some down time in our future so don't worry if the servers or site is down, it's probably me working on something.

InsaneJ
25th April 2020, 13:58
All Dynmap files are now running off a raid-5 array of WD Red drives. I've removed the WD Purple's volumes and configuration from the server and raid controller; it is no longer mounted or being used. We'll see if that makes any difference.

InsaneJ
23rd June 2020, 10:19
My plan for the upcoming weeks is to upgrade the server again. We're still having intermittent stability issues with the current server and it's getting old having to deal with those.

What I'm thinking of doing is replacing the 1920X Threadripper with an AMD 3900X or perhaps a 3950X on an X570 motherboard with 128GB of memory. Rough estimate:

700 for the memory
300 for the motherboard
450 or 775 for the CPU
75 for CPU cooler


So somewhere between roughly 1525 and 1850 euros. I haven't decided on the exact memory type (ECC/non-ECC) and motherboard yet. I'm currently looking into VRM benchmarks.

The CPU will be a lot faster clock-for-clock so along with the extra memory the new server will offer enough capacity to run additional game servers like the new Technode Firmacraft for 1.12 pack.

As always. If you're feeling generous, you know where the donate button is :)

InsaneJ
23rd June 2020, 16:10
Alright, I bit the bullet and ordered the following parts.


Gigabyte Aorus X570 MASTER (https://www.gigabyte.com/Motherboard/X570-AORUS-MASTER-rev-10)
Excellent VRM setup. Only beaten by boards costing several hundred euro more. It has two ethernet connections: 1gbit Intel and 2.5gbit Realtek. The Intel NIC will be dedicated to the pfSense router while the Realtek NIC will handle all the LAN traffic. The motherboard has support for ECC memory, but I opted to go with a non-ECC kit (see below) because the kits on the "supported list" only went up to 16GB sticks and we need 32GB sticks to get to 128GB of RAM total.
€379
4124

AMD Ryzen 9 3950X (https://www.amd.com/en/products/cpu/amd-ryzen-9-3950x)
16 cores / 32 threads. The fastest mainstream CPU currently available. Not much to say here. The only way to get an even faster CPU is to go with a high-end desktop system based on Threadripper or server grade Epyc. Intel just isn't an option anymore at this point.
€779
4125


Corsair Vengeance LPX CMK64GX4M2D3600C18 (https://www.corsair.com/us/en/Categories/Products/Memory/VENGEANCE-LPX/p/CMK64GX4M2D3600C18)
Two kits of 2x 32GB for a total of 128GB. This kit is on the "supported list" of the motherboard. Although it's rated at 3600MHz, when running 4 sticks of 32GB the memory speed will drop to 2666. I could have gone with a (very) slightly cheaper kit of 4x 32GB 2666, but with this kit I potentially have the option to run the memory at lower timings.
€668 (2x 334)
4126


Noctua NH-D15 (https://noctua.at/en/products/cpu-cooler-retail/nh-d15)
One of the best air coolers on the market.
€89
4127


VGA0419 (https://www.mc.co.th/eshop/vga/3526-pcie-vga-2d-power-saving-graphics-controller.html)
I've also bought a low power (2W) PCI-e graphics card. It's basically one of those onboard VGA adapters you see on server boards, except now on a PCI-e card. It's only meant to display the BIOS and text-based terminals. It's been running in the current server for almost a week now. Saves a bit on power and it was the last part I could swap out without having to replace motherboard/cpu/ram.
€8 on ebay
4128


All in all these parts cost a bit more than I initially thought they would coming in at a little over 1900 euro. The parts should arrive June 26th. Looking forward to assembling the new server :)

InsaneJ
26th June 2020, 17:12
Crapper. I just got an email notifying me that delivery of the CPU cooler will be delayed by one or two days.

Perhaps I can mount the current server's CPU cooler if I can find the AM4 mounting bracket for that...

InsaneJ
26th June 2020, 18:20
And another update via email stating that it will arrive today. I guess we'll see the items when we see them :)

InsaneJ
27th June 2020, 11:49
Yesterday the CPU cooler did arrive eventually. Today the rest of the parts should arrive. When they do I'll assemble the parts and test them. If they check out I'll take the server offline and do the upgrade.

I have no exact time estimate. If the website and Minecraft servers are down, you'll know I'm working on it :)

InsaneJ
27th June 2020, 20:23
I've upgraded the server. It took a bit longer than I had hoped but that's how these things go. And this time around I was too tired to plan everything out, get everything tested and running in a shadow configuration. So big-bang-all-at-once-upgrade it was :)

The server is now running the hardware listed above, with the exception of the 2W graphics card. The new motherboard doesn't seem to output anything on the card although it boots up just fine. So we could run it without display if we wanted to. For now I've installed a Matrox card.

The motherboard BIOS has been updated to the latest version. And to my surprise I could run the memory at 3600MHz. From everything I've read so far that shouldn't be possible. When running 128GB of RAM on an X570 motherboard, the RAM is supposed to be downclocked to 2133MHz. This gives you a performance penalty, but you do get to use all the RAM. In our case I set the motherboard to use the XMP memory settings; it then selected 3600MHz and it seems to be running well so far.

With this new motherboard, I've decided to use the NVMe SSD I had previously disabled. It's now being used as a read cache for the storage drives. This should speed up stuff like chunk loading if those chunks have been loaded before.

The new motherboard has two LAN ports. One is 1gbit Intel, the other 2.5gbit Realtek. The Realtek card is currently not supported by VMWare ESXi so I had to install another network card. I had a quad port Intel server network card lying around so I used that. It works well enough, but I'd still like to get the onboard card working so I don't need the extra expansion card.

Right now I have an Areca raid controller installed, but I'm going to add another raid controller soon. Before, I had an LSI adapter with two 6TB SAS drives in raid 1. I'd like to migrate to raid 5, so I've bought a second hand Adaptec card and two more drives. That means there would be 4 expansion cards installed. While there is room for four 16x size and one 1x size cards, I'd rather not stack the raid controller between a graphics card and the quad port network card. Even though there is a large fan blowing directly on all the expansion cards from the side, I'd prefer to leave more space to help with cooling. Not to mention it's a waste of power.

The server looks to be running well right now. Let me know if any issues crop up.

InsaneJ
19th July 2020, 10:56
So I ended up ditching the second SAS controller and SAS drives altogether and replacing them with four new 4TB WD Red drives. Though not the fastest, they are meant to run 24/7 with several drives in the same chassis.

The hard drive configuration is now:

4x 3TB WD Red in raid 5
4x 4TB WD Red in raid 5
2x 256GB Samsung 850 PRO SSD in raid 1


There are now about a dozen or so VMs running on these drives and everything is responding well and feeling snappy. Unfortunately on Friday one of the new 4TB drives failed. Usually when a drive fails it's near the beginning or near the end of its lifetime, so it's not entirely unexpected. Because RMA can take a while I've ordered another 4TB WD Red drive and will be replacing the faulty drive sometime soon. When the faulty drive comes back from RMA I'll keep it as a spare, I think. Or perhaps I'll put it in my backup server. We'll see :)

InsaneJ
19th July 2020, 23:51
I've replaced the faulty 4TB drive and the array is now being rebuilt. I'll keep the VMs that run off this raid array offline for the time being to help speed up the process. This includes the Minecraft servers. It should be done when I wake up in the morning :)

*edit*
Everything is back to normal.

InsaneJ
6th August 2021, 11:16
It's been a year. Time for some server stuff!

I noticed there's been significant chunk corruption on the Vanilla 1.16 world. The shopping island has been wiped along with half of someone's base near spawn, and my witch farm is also cut in half. The automatic backups only keep a week's worth of daily backups. I manually back up those backups to my own PC from time to time, but as posted in a different thread the disk I use for that isn't in great shape, so I'm unable to restore the corrupted chunks. Those chunks have been automatically regenerated but all the work that was put into them is gone I'm afraid.

I'm currently working on setting up better backup facilities. A friend has provided a Dell workstation. I've bought four 4TB WD Red drives, 48GB of ECC memory and an Areca raid controller. The daily backups will stay where they currently are, on a 1U rack server at my place. On the new server I'll be doing weekly and monthly backups. Weeklies will be kept for up to 1 month and the monthly backups for up to 1 year. The new server will be hosted off-site at my friend's place. He's also generously paying for the electricity.
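The retention scheme is simple enough to sketch. This is not the actual backup script, just an illustration of the keep/prune rules described above (it assumes weeklies land on Sundays and monthlies on the 1st of the month):

```python
# Sketch of the retention rules: dailies for a week, weeklies for a month,
# monthlies for a year. Not the real backup script, just the decision logic.
from datetime import date, timedelta

def keep_backup(backup_date, today):
    age = today - backup_date
    if age <= timedelta(days=7):
        return True                                        # daily backups: 1 week
    if age <= timedelta(days=31) and backup_date.weekday() == 6:
        return True                                        # weekly (Sunday): 1 month
    if age <= timedelta(days=365) and backup_date.day == 1:
        return True                                        # monthly (1st): 1 year
    return False

today = date(2021, 8, 6)
for d in (date(2021, 8, 3), date(2021, 7, 18), date(2021, 3, 1), date(2020, 6, 1)):
    print(d, "keep" if keep_backup(d, today) else "prune")
```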

So all in all this backup server, or server for backups, whichever you want to call it, is going to cost me around 600 euro and a bunch of time to set up. If you're feeling generous and would like to contribute, there's a PayPal button on the home page :)

InsaneJ
10th September 2021, 08:39
I got a second hand Areca off eBay and put it in the backup server to run the new drives in raid 5. When configuring the controller I noticed this:
4240

Looks like the previous owner didn't factory reset the controller before sending it to me. It appears to be an old raid controller from Valve. Who knows, perhaps it was even used to serve game content to Steam users at some point.

DOM
10th September 2021, 12:14
HL3 confirmed!

Jiro_89
12th September 2021, 20:27
Oooooo

InsaneJ
23rd September 2021, 09:04
Like I've mentioned elsewhere I've upgraded the server with three 1TB NVMe SSDs. Two of these will host the game servers on a raid1 array while the third will be used for Dynmap.

Right now the server is copying all the files over from the old virtual hard disks to the new ones. This will take a while so I'll just let that run while I do other things :)

InsaneJ
23rd September 2021, 12:17
Lunch break update.

Copying the Minecraft servers from HDD to SSD went fairly quickly. I've added the appropriate services. Installed Java 8 and 16 from AdoptOpenJDK and put the correct paths to those in the appropriate Minecraft monitoring scripts. And then they all just started. No fuss or crazy errors.

4262


Now all that's left is to copy over the Dynmap tiles. This is going to take much longer. Our Dynmaps consist of millions of tiny jpg/png files. Copying those over is relatively slow. It's much faster to copy a single large file than a lot of smaller ones, even if the total amount of GB is the same.

Right now we have around 536GB of Dynmap data and so far 35GB has been copied to SSD, going at an average rate of about 7MB/s. So for now the Minecraft servers will display Dynmap; you're just missing the tiles, which means you're basically looking at the void. As the rsync job progresses the Dynmaps will become complete again. Remember when rendering the TerraFirmaCraft server's Dynmap took a week? This won't take that long though.
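A quick back-of-the-envelope estimate of how long that rsync run should take at the current rate (rough numbers, the rate will bounce around):

```python
# Rough remaining-time estimate for the Dynmap rsync at the observed rate.
total_gb, copied_gb, rate_mb_s = 536, 35, 7

remaining_mb = (total_gb - copied_gb) * 1024
hours_left = remaining_mb / rate_mb_s / 3600
print(f"~{hours_left:.0f} hours left at {rate_mb_s} MB/s")  # roughly 20 hours
```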

As far as performance goes, the Minecraft servers are noticeably faster now. Loading chunks when flying across the map is nearly as fast as when running it on a local machine. The parts of Dynmap that are already present also load much faster. It's great. I like it a lot :)