In this release we have updated the dashboard on our Hypernode Control Panel to display more configurable settings for your Hypernode and we have made our Hypernode configuration management more resilient against stubborn newrelic-daemon processes.
Additional Dashboard settings
During the development of our new Hypernode Control Panel, which is going to replace our legacy Service Panel, we chose to go with an API-first approach. This means that our Control Panel web-application is simply a front-end for our public-facing API. Everything we implement in the Control Panel you could in theory also implement yourself in your own web-application or command-line tool.
This makes it very easy for us to add certain features to the Control Panel, like displaying the values that you can configure from the command-line for your Hypernode. If you are logged in to your Hypernode with SSH you can run the hypernode-systemctl settings --help command to see which settings you can currently configure using the Hypernode API.
$ hypernode-systemctl settings --help
usage: hypernode-systemctl settings php_version 7.1
The possible values are:
enable_ioncube ['True', 'False']
password_auth ['True', 'False']
openvpn_enabled ['True', 'False']
unixodbc_enabled ['True', 'False']
supervisor_enabled ['True', 'False']
mailhog_enabled ['True', 'False']
modern_ssl_config_enabled ['True', 'False']
support_insecure_legacy_tls ['True', 'False']
modern_ssh_config_enabled ['True', 'False']
mysql_tmp_on_data_enabled ['True', 'False']
redis_persistent_instance ['True', 'False']
firewall_block_ftp_enabled ['True', 'False']
disable_optimizer_switch ['True', 'False']
mysql_disable_stopwords ['True', 'False']
mysql_enable_large_thread_stack ['True', 'False']
mysql_enable_explicit_defaults_for_timestamp ['True', 'False']
rabbitmq_enabled ['True', 'False']
elasticsearch_enabled ['True', 'False']
elasticsearch_version ['5.2', '6.x', '7.x']
varnish_esi_ignore_https ['True', 'False']
varnish_large_thread_pool_stack ['True', 'False']
varnish_enabled ['True', 'False']
blackfire_enabled ['True', 'False']
permissive_memory_management ['True', 'False']
varnish_secret string
varnish_version ['4.0']
mysql_ft_min_word_len ['4', '2']
php_version ['5.6', '7.0', '7.1', '7.2', '7.3', '7.4', '8.0']
mysql_version ['5.6', '5.7', '8.0']
blackfire_server_id string
blackfire_server_token string
override_sendmail_return_path string
php_apcu_enabled ['True', 'False']
php_amqp_enabled ['True', 'False']
managed_vhosts_enabled ['True', 'False']
nodejs_version ['6', '10']
new_relic_enabled ['True', 'False']
new_relic_app_name string
new_relic_secret string
datadog_enabled ['True', 'False']
datadog_apikey string
datadog_region string
positional arguments:
{enable_ioncube,password_auth,openvpn_enabled,unixodbc_enabled,supervisor_enabled,mailhog_enabled,modern_ssl_config_enabled,support_insecure_legacy_tls,modern_ssh_config_enabled,mysql_tmp_on_data_enabled,redis_persistent_instance,firewall_block_ftp_enabled,disable_optimizer_switch,mysql_disable_stopwords,mysql_enable_large_thread_stack,mysql_enable_explicit_defaults_for_timestamp,rabbitmq_enabled,elasticsearch_enabled,elasticsearch_version,varnish_esi_ignore_https,varnish_large_thread_pool_stack,varnish_enabled,blackfire_enabled,permissive_memory_management,varnish_secret,varnish_version,mysql_ft_min_word_len,php_version,mysql_version,blackfire_server_id,blackfire_server_token,override_sendmail_return_path,php_apcu_enabled,php_amqp_enabled,managed_vhosts_enabled,nodejs_version,new_relic_enabled,new_relic_app_name,new_relic_secret,datadog_enabled,datadog_apikey,datadog_region}
positional_value
optional arguments:
-h, --help show this help message and exit
--value DEPRECATED_VALUE
This option is deprecated. Use the positional value
instead
We update this list of settings automatically when we add new functionality to the API. For example, the opt-in php_amqp_enabled module is a setting that we added recently to support the software that some of our users run.
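As a rough illustration of building your own tooling on top of these settings, the sketch below wraps the command form shown in the usage line of the help output (hypernode-systemctl settings php_version 7.1) in a small Python helper. The function names and the ALLOWED_VALUES table are hypothetical and only cover a few of the settings listed above; nothing here is part of the Hypernode tooling itself.

```python
import subprocess

# A small subset of the values printed by `hypernode-systemctl settings --help`.
# This table is an assumption for the sketch; the real list changes over time.
ALLOWED_VALUES = {
    "php_version": ["5.6", "7.0", "7.1", "7.2", "7.3", "7.4", "8.0"],
    "mysql_version": ["5.6", "5.7", "8.0"],
    "php_amqp_enabled": ["True", "False"],
}


def build_settings_command(setting, value):
    """Return the argv for changing one setting, validating the value first."""
    allowed = ALLOWED_VALUES.get(setting)
    if allowed is None:
        raise ValueError(f"unknown setting: {setting}")
    if value not in allowed:
        raise ValueError(f"{value!r} is not valid for {setting}; choose from {allowed}")
    # Same shape as the usage line: hypernode-systemctl settings <setting> <value>
    return ["hypernode-systemctl", "settings", setting, value]


def apply_setting(setting, value):
    """Run the command; this only works on a machine where the CLI is installed."""
    subprocess.run(build_settings_command(setting, value), check=True)
```

Validating the value before running anything keeps a typo such as php_version 7.5 from ever reaching the command line.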
In the Control Panel on the dashboard page there’s a display of some of these values to give you a quick insight into what’s configured for this Hypernode. In this change we’ve selected a couple more values that we wanted to display on this page as well.
Because not all settings are equally relevant, there are some values that we’ve kept out. There are also some values on the page that you cannot configure through the API as a user, but that can be configured in consultation with our Support department, like a static InnoDB buffer pool size for MySQL.
Before:
After:
This update to the Control Panel will go live this week.
Dealing with stubborn newrelic-daemons
On the Hypernode platform we’re very particular about deploying updates to the system. Because stability and predictability are so important for e-commerce, we are very careful about making changes and we strive to have as much control over the entire stack as we can. We are very diligent in the way we deal with software configurations: we host our own operating system repositories and package a lot of our own software, so that the chance of things entering our ecosystem without notice and without being properly tested is slim.
While we go far in trying to control all facets of the hosting environment, there are still factors of uncertainty that we can’t anticipate. There are users using the Hypernodes in ways that we could not have predicted (people are very creative!) and at the end of the day software remains software and things sometimes behave in a manner you did not foresee.
For that reason we have a development practice where we write a unit test for any expected functionality of the system. On Hypernode we call these the ‘nodetests’ and they can tell us immediately if something unexpected is going on on the server, as long as it happens within the bounds of things we have explicitly configured.
Every change that we make to the server we make with an Ansible task in an Ansible playbook. We write those tasks in an idempotent manner, meaning that a task is not just a script that runs once and breaks if you run it again. The playbooks are meant to be run multiple times and the eventual state should be the same (it ‘ensures’ the state). This is exactly what our automation is doing when you see update_node in your hypernode-log output.
$ hypernode-log | grep update
update_node 2021-06-29T07:50:09Z 2021-06-29T07:50:10Z running 2/3 full_update_node_to_update_flow
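To make the idea of idempotency a bit more concrete, here is a minimal Python sketch of an ‘ensure’ style operation. It is not one of our Ansible tasks; the ensure_line function and the file it manages are hypothetical and only illustrate the principle that running the same step twice leaves the system in the same state.

```python
import os


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed and False if it was
    already in the desired state, so repeated runs are harmless.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in content.splitlines():
        return False  # desired state already reached, nothing to do
    # Add a newline first if the existing file does not end with one.
    separator = "" if content == "" or content.endswith("\n") else "\n"
    with open(path, "a") as handle:
        handle.write(separator + line + "\n")
    return True
```

The first call reports a change and every later call reports that nothing needed to be done, which is the same contract a configuration management run relies on.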
To make sure the state of each Hypernode is as we expect it to be, we write a Python unit test for each Ansible task to see if it did what we expected it to do. If any of those tests fail, it means the system is in an unexpected state. These nodetests failing is exactly what you see when the output of hypernode-log displays ‘reverted’.
$ hypernode-log | grep update
update_node 2021-06-28T08:11:02Z 2021-06-28T08:16:08Z success 3/3 finished
update_alerting 2021-06-28T08:10:59Z 2021-06-28T08:11:27Z success 2/2 finished
update_node 2021-06-28T07:30:03Z 2021-06-28T07:37:34Z reverted 1/3 finished
update_node 2021-06-28T07:26:10Z 2021-06-28T07:31:59Z reverted 1/3 finished
Generally this is no big deal and no reason for alarm because the automation will fix it with a subsequent configuration management update, or if it’s a real issue one of our support engineers will likely take note of it and fix the problem.
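If you want to keep an eye on this yourself, a small script can pick the reverted runs out of the log. The sketch below parses lines shaped like the hypernode-log output shown above; the field names (action, start, end, state, progress, flow) are our own labels for the six whitespace-separated columns and are an assumption, not an official format description.

```python
from collections import namedtuple

# Labels for the six columns seen in the sample output; assumed, not official.
LogEntry = namedtuple("LogEntry", "action start end state progress flow")

SAMPLE = """\
update_node 2021-06-28T08:11:02Z 2021-06-28T08:16:08Z success 3/3 finished
update_alerting 2021-06-28T08:10:59Z 2021-06-28T08:11:27Z success 2/2 finished
update_node 2021-06-28T07:30:03Z 2021-06-28T07:37:34Z reverted 1/3 finished
update_node 2021-06-28T07:26:10Z 2021-06-28T07:31:59Z reverted 1/3 finished
"""


def parse_log(text):
    """Turn log text into LogEntry tuples, skipping lines that do not fit."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 6:
            entries.append(LogEntry(*fields))
    return entries


def reverted_entries(entries):
    """Return only the runs whose state column reads 'reverted'."""
    return [entry for entry in entries if entry.state == "reverted"]
```

Calling reverted_entries(parse_log(SAMPLE)) returns the two update_node runs from 07:26 and 07:30 in the sample above.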
But sometimes we find software behaving in a way that is unexpected and we have to go debug and find out exactly why things aren’t going as we expected them to. This is exactly what happened this week with the newrelic-daemon. On Hypernode you can enable and disable New Relic. When you enable New Relic it starts a service that can help you monitor and debug performance bottlenecks in your application.
We noticed that when New Relic was first enabled on a Hypernode and then later disabled, the newrelic-daemon process was sometimes still running even though the systemd service had been stopped. Further investigation showed that the newrelic-daemon process could end up in a state where it did not respond to a SIGTERM signal and could only be stopped with a SIGKILL (kill -9).
In this release we have updated our configuration management so that when New Relic has been disabled and the newrelic-daemon process does not exit after a graceful signal, the process is forcefully killed.
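The general pattern of trying a graceful signal first and only then escalating can be sketched in a few lines of Python. This is an illustration of the technique on a POSIX system, not the Ansible task we actually deployed; the stop_process helper and its grace period are hypothetical.

```python
import subprocess


def stop_process(proc, grace_seconds=5.0):
    """Stop a subprocess.Popen child: SIGTERM first, SIGKILL as a fallback.

    Returns "terminated" if the graceful signal was enough and "killed"
    if the process had to be stopped forcefully.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX, which a process cannot ignore
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"
```

Giving the process a grace period first matters because a well-behaved daemon can use SIGTERM to flush data and clean up, while SIGKILL gives it no such chance.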