Wednesday, August 29, 2018

Device Not Found when Removing HDD from mdadm RAID

I ran into a case where, while trying to fail and remove a disk from an mdadm RAID array using either of the following two commands:
  • mdadm --manage /dev/md1 --fail /dev/sdf --remove /dev/sdf
  • mdadm /dev/md1 --fail /dev/sdf --remove /dev/sdf
I got the error "device not found".

Looking at Disks (gnome-disks), the device /dev/sdf still shows as a member of the RAID /dev/md1.


What is the solution?

After examining the array with both:
  • mdadm --detail /dev/md1
  • cat /proc/mdstat
and confirming that /dev/sdf is no longer active in the RAID array, I executed mdadm --zero-superblock /dev/sdf. This resets the disk's RAID metadata, and it now appears as an empty disk in Disks (gnome-disks).

I can then re-add the HDD to the array as a spare unit: mdadm --add /dev/md1 /dev/sdf.
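
Putting it all together, the full recovery sequence looks like this (a minimal sketch, assuming the array is /dev/md1 and the affected disk is /dev/sdf):

# confirm /dev/sdf is no longer an active member of the array
mdadm --detail /dev/md1
cat /proc/mdstat

# wipe the stale RAID metadata so the disk shows up as empty again
mdadm --zero-superblock /dev/sdf

# re-add the disk to the array as a spare
mdadm --add /dev/md1 /dev/sdf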

Saturday, May 26, 2018

Setting Up Red Hat Virtualization with Single Server

Red Hat Virtualization is an interesting and powerful virtualization platform, but it is rather challenging to set up with only a single host compared to Hyper-V, VMware, and VirtualBox. One may argue that RHEV is not designed for a single host (I guess that is why there is not a single article on how to do this setup), but who really cares if you just want a tiny-scale deployment? Single host rules!



Well, I could go for RHEL + KVM, but I simply like the interface of RHEV.
Getting it up and running requires experience across numerous areas of expertise, so this simplified guide is not for newbies.

Setting up RHVH itself is not hard, so I'm going to skip that part. Now, to get the hosted-engine up and running, please make sure the following are done:

1. Change the hostname from localhost.localdomain to a proper hostname. In my case, I use robustpoc.
2. Be sure to set a fixed IP in /etc/sysconfig/network-scripts/ifcfg-eth0, then reconnect on the new IP using ifdown eth0 and ifup eth0 (a sketch of the files involved follows after the note below).
3. Edit /etc/hosts to resolve the host IP address to the hostname.
4. To make this IP change persistent, be sure to change the settings in /var/lib/vdsm/persistence/netconf/nets/ovirtmgmt as well.

Note: If the server is in a VLAN network, be sure to set a correct gateway; otherwise it will not be able to reach the Internet or be reached by others on the network.
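
For steps 2 and 3, a minimal sketch of the two files involved; the addresses below are hypothetical, so substitute your own:

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- fixed IP instead of DHCP
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.0.25
PREFIX=24
GATEWAY=192.168.0.1
DNS1=192.168.0.1

# /etc/hosts -- resolve the host IP to the hostname
192.168.0.25   robustpoc

Then reconnect on the new address with ifdown eth0 followed by ifup eth0.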

5. Now, before proceeding with hosted-engine --deploy, you need to add your FQDN to /etc/hosts. In my case, I use 192.168.0.26 robustpoc.com. This is critical.
6. Next, set up an NFS share to serve as the storage domain; a sketch follows below.
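
A minimal sketch of an NFS share suitable for step 6; the path /exports/data is my own choice, and the 36:36 ownership corresponds to the vdsm user and kvm group that RHV expects:

# create the export directory with vdsm:kvm (36:36) ownership
mkdir -p /exports/data
chown 36:36 /exports/data
chmod 0755 /exports/data

# add this line to /etc/exports
/exports/data   *(rw,anonuid=36,anongid=36,all_squash)

# start NFS and publish the export
systemctl enable nfs-server
systemctl start nfs-server
exportfs -ra
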
7. At this point, we are ready to execute hosted-engine --deploy. There are three critical settings to take note of during the process:
8. Be sure NOT to set the memory of the engine VM to the maximum available memory of the system, as the setup suggests. Setting it to the maximum leaves no memory to create even a single virtual machine.
9. Be sure to use a fixed IP for the engine, and make sure the gateway is correct too.
10. Be sure to answer YES when asked whether to update /etc/hosts on both the host and the engine VM. The default answer is NO, and missing this will be a killer later.

By this point, everything critical should already have been taken care of. Fingers crossed, and I wish you the best of luck!

Common errors that I have encountered (they are not guaranteed to happen to you):
1. Error: [Get Local VM IP] failed
2. Info: [Waiting for local VM to be up]
3. [ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook; after this I could access RHEV Manager, but a failed external hosted engine was present under Compute > Hosts.

You are welcome to share your installation experience with me; I'm happy to listen and learn from you.

Wednesday, December 23, 2015

Mdadm Must Be Configured on Partition

Referring to my previous post Missing mdadm array after reboot, I would like to add the reason why auto-scan does not work after the system reboots: it requires the RAID to be built on partitions instead of whole disks!

This is going to cause problems:
mdadm --create --level=6 --raid-devices=5 /dev/md0 /dev/sd[b-f]
and this would be correct; please note the digit at the end of the line:
mdadm --create --level=6 --raid-devices=5 /dev/md0 /dev/sd[b-f]1

It means you must partition every disk with at least one partition, and build the RAID from those partitions.
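
If the disks are still raw, a minimal sketch of the partitioning step, assuming /dev/sdb through /dev/sdf are the member disks:

# give each disk a GPT label and one full-size partition flagged as Linux RAID
for d in /dev/sd[b-f]; do
    parted -s "$d" mklabel gpt
    parted -s "$d" mkpart primary 0% 100%
    parted -s "$d" set 1 raid on
done

# then build the array from the partitions, not the whole disks
mdadm --create --level=6 --raid-devices=5 /dev/md0 /dev/sd[b-f]1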

I hope this is helpful for novice users who are trying to get their RAID running properly.

Monday, November 9, 2015

Problem Updating Repo List in Ubuntu 12.10

Ubuntu 12.10 has reached end of life, so its package repositories have been moved to old-releases.ubuntu.com. To update the sources list in one go, use the following command:

sudo sed -i -e 's/archive.ubuntu.com\|security.ubuntu.com/old-releases.ubuntu.com/g' /etc/apt/sources.list

If you had chosen another mirror, your sources list might contain us.archive.ubuntu.com; the substitution above would then leave you with us.old-releases.ubuntu.com, which is not a valid host. You need to remove the us. prefix from it, and the command below will do that:

sudo sed -i -e 's/us.old-releases.ubuntu.com/old-releases.ubuntu.com/g' /etc/apt/sources.list
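
Afterwards, verify the result and refresh the package index:

grep ubuntu.com /etc/apt/sources.list   # every mirror should now point to old-releases.ubuntu.com
sudo apt-get update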

Sunday, October 4, 2015

Houdini Start-Up Crashed on Render Node

As far as I know, Houdini cannot start its main application (GUI) without an OpenGL-capable graphics card. So if you are trying to start it on a render node with ASPEED on-board graphics, it is impossible; all you can do is run the render from the CLI.
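
For reference, a minimal CLI render sketch; the install path, the output driver name mantra1, and the scene file are placeholders for illustration:

# load the Houdini environment, then render frames 1-240 through the mantra1 ROP
cd /opt/hfs15.0 && source houdini_setup
hrender -e -f 1 240 -d mantra1 /path/to/scene.hip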

If you encounter the same issue and are using an Nvidia graphics card, you need to switch the Xorg graphics driver to the Nvidia proprietary driver. You can do that at:

Menu > Settings > Software and Update > Additional Drivers

Sometimes the latest driver may not work well, so you may have to try different versions to get it working.

Tip: To get a clearer picture of why Houdini failed to start, run it from a terminal instead of from the software menu. The error messages there are far more readable than the crash dump file.
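
For example (assuming the Houdini environment script has already been sourced):

houdini   # OpenGL and driver errors now print straight to this terminal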

Wednesday, June 17, 2015

Missing mdadm array after reboot

It is a heart-attack moment when you reboot your Linux storage server and realize the mdadm array has gone missing!
  • The RAID volume is not showing in gnome-disks.*
  • gparted reports empty partitions on all the RAID disks.
sudo mdadm --examine --scan -v shows:
mdadm: looking for devices for /dev/md0
mdadm: Cannot assemble mbr metadata on /dev/sda
mdadm: Cannot assemble mbr metadata on /dev/sdb
mdadm: Cannot assemble mbr metadata on /dev/sdc
mdadm: Cannot assemble mbr metadata on /dev/sdd
mdadm: Cannot assemble mbr metadata on /dev/sde
sudo mdadm --examine /dev/sd* or sudo mdadm --query /dev/sd* shows:
mdadm: No md superblock detected on /dev/sd*
sudo mdadm --assemble /dev/sd[b-f] shows:
mdadm: device /dev/sdb exists but is not an md array.
mdadm: No arrays found in config file or automatically
If you look in /etc/mdadm/mdadm.conf (Debian / Ubuntu) or /etc/mdadm.conf (Fedora), you will realize there is no ARRAY defined. On some systems mdadm.conf simply does not exist.

No matter how hard you try, the mdadm array just won't show up. There are numerous suggestions on the web, the most common being to append the scan results to mdadm.conf so the array is assembled during boot:
 sudo mdadm --examine --scan --config=mdadm.conf >> /etc/mdadm/mdadm.conf
However, the above simply did not work for me. I also tried adding the ARRAY line manually to /etc/mdadm/mdadm.conf, where the UUID is that of the first HDD used for the array (to show the UUID, use blkid; strangely enough, I could not get the UUID of my GPT HDDs in Ubuntu, only in Fedora):
ARRAY /dev/md0 metadata=1.2 UUID="0db0c336:f56bd888:2f9e92e4:c1d64c09"
That did not work either. I had no other option except to re-create the array. Be very careful with this step; I was only daring enough to take it because I had backed up my data one day before I lost the array. When you re-create the array, BE SURE to use --assume-clean, and make sure the parameters are EXACTLY the same as when you created it the first time. In my case, it is very simple and straightforward:
mdadm --create --assume-clean --level=6 --raid-devices=5 /dev/md0 /dev/sd[b-f]
The RAID array was created and I immediately got all my data back!! It is advisable to back up your data now, and once the backup is done, do a data scrub to ensure the array is running well.
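
For the scrubbing step, a minimal sketch assuming the array is /dev/md0:

# kick off a full consistency check (scrub) of the array; run as root
echo check > /sys/block/md0/md/sync_action

# watch progress, then see whether any mismatches were found
cat /proc/mdstat
cat /sys/block/md0/md/mismatch_cnt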

I hope you will be as lucky as I was, getting all your data back without losing precious memories and important work.


* If it is shown, then you are lucky: the array has simply been renamed from /dev/md0 to something like /dev/md127. You can still use it by pointing your mount command at the new array name.

Tuesday, April 28, 2015

Never sign up VPS from this company

Recently I signed up for a VPS from a company called WideVPS, and it is really the worst hosting company I have ever encountered. For those of you who like affordable hosting, please be careful with this company:

  1. The VPS was only activated after a few hours of waiting (they claim instant activation).
  2. Support tickets took 4 hours to get a response (they claim a 1-hour response time).
  3. My ticket was closed even though the issue had not been resolved. I created a Windows VPS for testing before moving to Linux, and during sign-up I keyed in a password for the VPS. When it finally came up after 4 hours, the password I had entered did not work!! I then opened a ticket and emailed them; it took 2 days and was never resolved! All I got was "I will send you the info in the next email." Then? I received no email at all.
  4. I submitted a PayPal dispute, but they never responded to it (so you can see they are not serious about the business at all).
And this morning I received their invoice asking me to renew the hosting, even though to this day I have not been able to use the VPS at all!

Guys and gals, be careful. I'm glad if you have had a good experience with them, but if you are still considering them, just make sure you contact them before signing up and judge for yourself. I hope you will then not waste money like I did.