Koos Kombuis

Nutanix Cloud Manager (NCM)



Some people may be misled by the title of this blog to believe that Nutanix Cloud Manager (NCM) is a unified management tool that can handle your HCI environment both on-premises and in the cloud through a single GUI. However, it is actually a complex set of tools integrated into either Nutanix Central or Prism Central, along with Prism Element, arranged in a hierarchy.


Prism has transformed into a software-based Matryoshka doll, with Nutanix Central as the largest doll, Prism Central in the middle, and Prism Element as the smallest doll.



Nutanix Central was launched around May 30th 2024 and is supposed to be the nexus of up to 25 Prism Central domains for larger multinational corporations, spanning on-prem clusters and the various NC2 instances of Nutanix clusters in the public cloud.


Nutanix Central and Prism Central are not the same thing, however, so you have to pay attention here to grasp the finer details and concepts.


Currently, Nutanix provides three platforms within the cluster management category - Prism Element, Prism Central, and Nutanix Central. Although they claim that Nutanix Central is separate from Prism, I personally believe that it is actually a part of the Prism schema.


Prism Element and Prism Central have no cost of acquisition, by the way. PC runs as a VM on one of your clusters, while Prism Element is embedded in each node by default. Both do have a cost in resources, though, as I mention a few times in this blog.


Nutanix kinda-sorta achieves the centrally nested goal by exposing its various management features through the Prism Central (PC) GUI, which nests the many stand-alone add-on features so that they seem to be under the control of Prism Central, smoke and mirrors style.


Some of these "features" you get with the standard Starter edition. They bundle more add-on features with NCM Pro, and the whole enchilada of everything they have comes with NCM Ultimate, per the table listed further below.


However, you will notice, for example, that when you launch the NCM Cost Governance tool from within the PC GUI (it is just the old but excellent Nutanix BEAM cloud cost management tool), it actually fires up the BEAM GUI in your HTML5 web browser, and you are suddenly in stand-alone BEAM world. My version did not have a back button to get back into PC from the Cost Governance tool. That is because you can also run it solo without PC, and many Nutanix customers who do not have any Nutanix HCI nodes actually do run the BEAM package solo from any HTML5 web browser.


For customers who do have on-premises HCI clusters, it would be nice if the add-on tools could detect that and flip you back to the master control GUI you started from, which in my case is Prism Central.


The loss of Navigation control up and down I find most irksome indeed.


I am still not sure if this parasitic, nested-within-Prism-Central approach is the best way to do things, but you do get used to doing everything from Prism Central after a few days of use.


However, there are still things Prism Element does that PC does not yet do, and do not forget that Prism Central is running on at least one dedicated VM.


While I personally see Prism Central as a unique collection of HTML favorites for all things Nutanix-related, the reality is that it is much more complex than that, regrettably.


Currently, a heated discussion is taking place on social media regarding whether Prism Central justifies the overhead tax it demands. Whether it is worthwhile depends on your specific circumstances and the tasks you are carrying out with your HCI cluster or clusters.


Convenience here comes at a steep resource cost and it is full of irksome bugs to boot.


To ensure proper functioning of Nutanix CVM, it is recommended to begin with 64GB of RAM and 8 cores for medium-sized environments, and then upgrade to 128GB of RAM and 10 cores for bigger cluster workloads.


Do not forget you need resources for your virtual machines as well!


I received an education on this topic from a major electronic payments company based in San Jose and a reseller in Fresno, confirming that it is indeed an unfortunate reality.


A reality Nutanix stubbornly ignores.


For those with a fondness for dark humor, the Nutanix sizer tool starts at 4 cores and a mere 28GB of RAM when suggesting CVM resource overhead.


Despite the presence of a few bugs, the 2024 versions of Prism Central have generally performed well as long as the Prism Central add-ons are not all run simultaneously.


Nevertheless, it will remain a combination of JSON and Python like a Frankenstein lab creation until it is rewritten correctly at some point.


Therein lies the answer to the question. In response to all this argument on Reddit forums et al., Nutanix released a new Xtra-Small Prism Central deployment option on June 5th 2024 to go with the Small deployment option they already had.


Currently, I am testing it on two HPE DX365 3-node clusters and I have determined that it operates quite reliably, with some caveats.


If you are looking for a straightforward VM hosting system without any extra features, consider opting for the Xtra-Small option. It can be quite convenient for managing multiple clusters from a single console, especially for your core cluster management tasks.


Vanilla Prism Element is suitable for those who only have a single cluster!


I faced a dilemma when I required a virtual machine host system that would allow me to fully utilize every core and MB of RAM without any limitations. I was frustrated when my Prism Central instance consumed 32GB of RAM and 10% of my CPU's performance even before launching a single VM, leading to a major loss of humor on my part.


I managed with the Xtra-Small instance but am constantly wondering if I should revert to Prism Element (I ended up doubling my RAM).


AOS was taking 32GB and PC took another 32GB, making it very clear that 64GB of RAM on the node running PC was not going to cut the mustard here, so I initially ran PC on the 128GB Clusti node.
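The arithmetic behind that conclusion can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this post, and the 32GB defaults are simply the figures I observed above, not official Nutanix sizing guidance.

```python
def ram_left_for_vms(node_ram_gb, aos_gb=32, pc_gb=32):
    """Return the RAM (in GB) left for guest VMs on a node.

    aos_gb and pc_gb default to the 32GB each observed in this post;
    they are illustrative values, not Nutanix sizing guidance.
    """
    # Never report a negative amount of free RAM.
    return max(node_ram_gb - aos_gb - pc_gb, 0)


# Compare a few node sizes mentioned in this post.
for node_ram_gb in (64, 128, 256):
    free_gb = ram_left_for_vms(node_ram_gb)
    print(f"{node_ram_gb}GB node: {free_gb}GB left for VMs")
```

On a node that does not host the PC VM, pass `pc_gb=0` to see what is left once only AOS has taken its share.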


The fact is that starting RAM for nodes running PC should really be 256GB for 20 smallish VMs, and 512GB should be the realistic starting point for 35-plus mid-sized VMs.


As I alluded to in another legacy blog posting, after much ASUS TUF motherboard drama I ended up with one host with 128GB RAM and another with 64GB RAM. I ran PC on the 128GB box and joined the 64GB box to the Prism Central of that first Clusti instance.


I then split the workload over the two machines, with the hard-core compiling and AI stuff running on the 64GB Clusti instance that was unfettered by PC management tasks.


As I said, I ended up doubling the RAM on each Clusti box to get PC to work as I desired. The smaller cluster now runs 128GB of RAM, and the one running the PC VM is now armed with 256GB.


When managing multiple clusters that have basic asynchronous replication between a production cluster and a DR cluster, or when following the primary-secondary architecture, it is essential to consider the choice between using Prism Element and Prism Central carefully. This decision is particularly critical for smaller customers with restricted RAM and cores, as they might have to decide between the more resource-demanding and potentially troublesome Prism Central management or the simpler yet dependable Prism Element approach.


If you only have one production cluster, I would strongly recommend against deploying a Prism Central instance. The overhead involved is comparable to several additional virtual servers, which you could run instead if you managed the cluster with Prism Element. You would also gain stability and resources rather than risk what could be a nightmare Prism Central experience.


The explanation for this is that Prism Central fully loaded (NCM Ultimate) is not quite ready for mainstream use yet and has recently progressed from the Alpha stage to a stable beta stage in development, in my humble opinion.


Nonetheless, as I hinted at earlier, this also relies on the manner and purpose for which you are utilizing Prism Central.


Prism Central is quite handy if you simply need it for managing your VM OS images and implementing some basic replication between your two clusters for workload balancing purposes.


This presupposes that you are willing to give up the slice of resources that Prism Central will require as overhead.


If you are satisfied with the performance of Prism Element, its reasonable overhead, and rock-solid reliability, and you need to maximize your limited resources, there is no need to feel embarrassed about operating in this manner.


Simply inform the Nutanix team that you are addressing the actual situation and resist their bizarre plans regarding Prism Central.


Having said that let's dive into what Prism Central can do for the Vanilla users!


When you click on the triple-line hamburger at the top left of your Prism Central browser session, you will notice that the GUI menu has a lot of entries. Expand them all and you will see that PC does a lot more than Prism Element does.


But ask yourself if you really need all those features for what it is you are actually going to be doing.


By consolidating all their additional features (currently numbering around 19) within Prism Central, such as NCM Cost Governance, FLOW, and Security Central, Nutanix has exposed some weaknesses in the software development of NCM.


It would be beneficial for Nutanix to examine the Commvault Simpana Software bus architecture and consider developing their software to integrate into a shared bus in a comparable manner.


Nutanix heavily relies on Python and JSON in their control plane stack, which has led to issues that are fueling the debate against PC in general, in my humble opinion.


I discovered that managing your clusters becomes incredibly easy when you keep PC simple (NCM Starter). It's important to have multiple clusters for PC use rather than just one, and it's best not to overload the PC stack with too many components. Therefore, NCM Ultimate may not be practical unless you operate a small cluster dedicated solely to running Prism Central.


There are also some BC/DR pitfalls that you should consider carefully.


Nutanix heavily depends on REST API, JSON, and Python elements.


While there are many advantages to using REST API, it's important to note that if Python and JSON are not functioning properly in the stack, you may encounter significant issues.
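To illustrate the failure mode in general terms, here is a minimal sketch. This is not Nutanix code: the payload shape and the function name are hypothetical, and the only real API used is Python's standard `json` module. It shows how a single truncated JSON document raises an exception that halts the caller unless the error is handled explicitly.

```python
import json


def parse_cluster_status(payload):
    """Parse a JSON status payload; return None when it is malformed.

    The payload shape is hypothetical and only serves the illustration.
    """
    try:
        return json.loads(payload)
    except json.JSONDecodeError as err:
        # Without this handler the exception would propagate and stop the caller.
        print(f"Malformed JSON at character {err.pos}: {err.msg}")
        return None


good = '{"cluster": "prod", "state": "healthy"}'
truncated = '{"cluster": "prod", "state": '  # cut off mid-document

print(parse_cluster_status(good))
print(parse_cluster_status(truncated))
```

Whether a caller should fall back to `None`, retry, or fail loudly is a design decision; the point is only that the decision has to be made somewhere in the stack.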


I am surprised that Nutanix continues to use JSON, as it is outdated, unreliable, and lacks stability.


In my view, numerous Nutanix processes encounter problems due to JSON issues, causing the Python and other code relying on JSON to come to a halt.


Deploying that setup on containers with suitable code for that specific environment could be a potential solution, although it may require a significant amount of storage, memory, and Core overhead for orchestration purposes and for the tasks the PC instance performs, if executed correctly.


This is the reason why, if you are committed to ensuring a reliable Prism Central control plane experience, you should establish a dedicated Prism Central management cluster specifically for this purpose. It may even be necessary to set up multiple Prism Central-only clusters if you want to ensure seamless PC disaster recovery or deploy containers within these Prism Central clusters.


These suggested management clusters will NOT run anything other than Prism Central and a few carefully selected PC bolt-on features per management cluster.


A major online electronic payment company headquartered in San Jose is currently experimenting with this architectural model on AMD-based Lenovo nodes in their laboratory, and the data from their tests seems quite conclusive to me on this matter.


Certainly, proposing distinct management clusters in addition to the typical standard clusters may not be well-received by the financial team in any corporation. However, pragmatically, this approach is necessary when dealing with more than 60 clusters.


The move by Nutanix to embrace public cloud may seem puzzling at first, especially when considering that many of the pioneers in the public cloud space have reverted to a hybrid cloud approach. This aligns with my belief that hybrid cloud was the way to go as far back as 2012.


It appears that Nutanix is lagging behind the trend by 8 years in this aspect, and I believe that not many would opt for an IT strategy solely based on flaky cloud services.


When comparing a single point of failure in a traditional data center to relying solely on the cloud, it becomes clear how risky it is to put all resources in one cloud provider.


Not many rational individuals would go there.


Some individuals opted for a combination of Azure and AWS, but having two single points of failure (SPOF) still makes the stacks vulnerable to SPOF issues.


Public cloud can be rather useful, but transforming applications to be cloud-compatible by rewriting them can be extremely costly and mega impractical.


What if we deploy Nutanix cluster technology on Azure and AWS bare metal machines to eliminate numerous single points of failure issues commonly found in public clouds, all without the need to rewrite the software?


In my opinion, this is the most significant benefit that Nutanix NC2 offers for AWS or Azure.


It's important for people to realize that deploying Nutanix NC2 on Azure or AWS bare metal servers is different from using Nutanix NCI/NCM on Azure or AWS cloud stacks, as the latter option is not possible at this point in time.


In other words, Nutanix successfully negotiated to obtain a few configurations of bare metal servers exactly the same as those found in the data centers of AWS and Azure.


This is just a bare metal server like the ones you would buy from the usual server OEM vendors. The difference is that, for both Azure and AWS, it is customized and configured for running AOS on the Azure or AWS infrastructure stack, and it is offered in only a few fixed configurations.


Once you subscribe to the AWS bare metal servers for running Nutanix AOS, you still have to pay Nutanix for the NCI/NCM subscription combo, which includes the AOS software that runs on the public cloud bare metal servers.


The sole purpose of doing this is to avoid rewriting your legacy applications, also known as refactoring for the cloud, as you are not actually running them on the cloud with NC2.


It's crucial to compare servers from various manufacturers such as Dell, HPE, Cisco, Lenovo, etc., with AWS EC2 bare metal instances or Azure AN bare metal instances for a direct comparison.


The same rules apply: you still need a minimum of three servers (nodes) for a basic cluster.


One key distinction between servers from various OEM hardware vendors and the limited number of cloud bare metal servers is that the configurations of bare metal cloud nodes are fixed and cannot be altered.


The CPU, RAM and storage they come with are what they come with.


One major benefit is the ability to create a 3-node pilot cluster with elastic expansion capabilities, allowing you to add extra nodes when necessary, such as during a DR fail-over incident.


By following this approach, Disaster Recovery (DR) can serve as a fundamental framework. When switching to DR, you have the flexibility to increase capacity as needed and only incur costs for the duration of your DR scenario.


Once the DR event is resolved, you switch back to your usual production system and keep the 3 node pilot instance prepared for the next DR event.
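The pilot-plus-burst idea can be sketched as a simple node-hours calculation. Everything numeric below except the 3-node pilot is a hypothetical assumption for illustration: the hourly rate is not a real AWS, Azure, or Nutanix price, and the 5 burst nodes and 72-hour event are made-up figures.

```python
def dr_node_hours(pilot_nodes, burst_nodes, total_hours, dr_hours):
    """Node-hours billed when the pilot cluster runs the whole period
    and the burst nodes run only for the duration of the DR event."""
    return pilot_nodes * total_hours + burst_nodes * dr_hours


HOURS_PER_MONTH = 730   # approximate hours in a month
HOURLY_RATE = 10.0      # hypothetical price per node-hour, not a real quote

# 3-node pilot that sits idle all month (no DR event).
pilot_only = dr_node_hours(3, 0, HOURS_PER_MONTH, 0)

# Same pilot, plus 5 hypothetical burst nodes for a 72-hour DR event.
with_dr_event = dr_node_hours(3, 5, HOURS_PER_MONTH, 72)

# For comparison: keeping all 8 nodes running for the whole month.
always_full = dr_node_hours(8, 0, HOURS_PER_MONTH, 0)

for label, hours in (
    ("pilot only", pilot_only),
    ("pilot + DR burst", with_dr_event),
    ("always full size", always_full),
):
    print(f"{label}: {hours} node-hours, cost {hours * HOURLY_RATE:.2f}")
```

Plug in your own node counts, event duration, and quoted rates; the shape of the comparison matters more than these placeholder numbers.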


You can achieve all this without needing to rewrite your reliable and trustworthy apps for the cloud.


To emphasize how much running legacy applications matters to many customers: I continue to support clients who use JD Edwards accounting software on NT4 platforms.


My last functioning synapse finds it challenging to justify the significant cost increase associated with relying solely on a Public cloud for a solution that has numerous single points of failure.


This is particularly true when considering the exorbitant expenses of public cloud services.


However, it is worth noting that my computing career began with IBM and Hitachi Mainframes, which were also quite expensive at that time.


Currently, there are new cloud solutions emerging that do not operate on Azure, AWS, or GCP. This results in significantly lower costs compared to traditional cloud services offered by these major providers.


In my humble opinion, moving to the public cloud is not a cost-saving measure at all. It simply serves as a costly insurance policy for ensuring business continuity.


I think that Mainframe computing without internet and email distractions will become increasingly popular again due to the time wasted on internet browsing and email management by most people.


It would take a very brave few to consider running everything on a platform with numerous single points of failure and outages, where a cloud service going down could leave the business stranded and unable to operate, a reliable approach for solid business continuity.


Perhaps it is indicative of the current era that people consider this approach to be favorable. However, in my experience, downtime leads to financial losses, and businesses that fail to generate income quickly go bankrupt.


Just saying...


However, I digress from the point of this blog post which is NCM.


So let's dive in deeper to see what NCM is and what it can do for you, exactly.


To begin with, I have observed a lot of confusion about what NCM actually is. Let's examine the Nutanix feature sheet to explore the different tiers available: Starter, Pro, and Ultimate!


Starter: Gratis/Free | Pro: Paid | Ultimate: Paid

NCM Starter is included with every Nutanix cluster at no additional cost; it is bundled with the basic features as part of the package. However, the VM self-tuning features are not available with the Starter version.


Both NCM Pro and Ultimate include the Cost Governance module. Pro adds various self-service features, while the Ultimate version provides a comprehensive package with Security compliance and advanced Governance.


The NCM suite comprises a total of 19 different features and capabilities, all accessible through Prism Central, which serves as the central control center for presentation and launching.


In my view, combining security and regulatory compliance into one package for a Prism Central instance or instances is not the right approach.


By the way, you can have one Prism Central instance per cluster, but this goes against the federated approach that Nutanix is currently focusing on for further releases of their management software.


Dealing with multiple individual Prism Centrals also introduces new challenges in terms of replication and BC/DR fail-over situations.


My Rabbi sums this up with a single phrase: "Oi Vey!"


Many CISO professionals would insist on separating security admin from regular admin access to comply with security standards and best practices.


Therefore, this model is flawed from the beginning, even when utilizing the RBAC features, which are considered too basic for most CISO professionals to contemplate seriously.


Enabling all of these features in a PC instance can lead to some instability, although the 2024 versions of Prism Central are significantly "more" stable compared to the 2023 versions.


Nutanix addresses and resolves code issues much faster than any IT company I have previously worked with. They do not follow the traditional approach of IBM and EMC, where they would only address an issue if numerous customers complained about it not functioning as expected.


This is changing for the worse as Bain influences come to bear in the Nutanix board decision making process.


Nutanix should increase their utilization of AI tools for enhancing code quality and incorporate more 4GL styled tools similar to the legacy Da Conti Sensible Solution stacks.


Those Da Conti folks were ahead of their time by about 40 years.


To sum up, it is important to consider the use of Prism Element thoughtfully, even if Nutanix representatives suggest otherwise, especially when expanding the functionalities of Prism Central beyond the core NCM features.


Individually, each feature is highly valuable and useful. Originally, they were standalone tools, but now they are being combined with a wide range of other tools in a meat grinder-like fashion.


I believe that the unique characteristics and effectiveness of standalone tools have been compromised by the Prism Central Monster Smash.


I would suggest that Nutanix develop a new software bus architecture into which each feature set can be easily integrated, using a non-Java programming language, and that they improve both its visual appearance and its actual expected functionality.


CIO and CFO executives are unsatisfied with the reporting and exporting features, as they seek more comprehensive graphs to use in their executive duties.


However, what do I know? I have only been selling Nutanix since 2012....









