Whether you use Puppet, Chef, or Ansible, the modules and cookbooks out there skip the bootstrapping of the search head cluster captain. Here is an implementation using Chef in AWS EC2.
A helper function to get the captain election status.
# chef/cookbooks/splunk/libraries/helper.rb
require 'mixlib/shellout'

# Returns true if `splunk list shcluster-captain-info` reports an elected captain.
# The command runs as soon as the helper is called (not deferred to a ruby_block),
# so the result is available to the recipe logic that branches on it.
# Splunk must already be installed and running when this is called.
def is_captain_elected
  splunk_cmd = "#{node['splunk']['server_home']}/bin/splunk"
  result = Mixlib::ShellOut.new("#{splunk_cmd} list shcluster-captain-info")
  result.run_command
  # Inspect stdout; the ShellOut object itself is not a string
  Chef::Log.info("shcluster-captain-info: #{result.stdout}")
  result.stdout.include?('elected_captain')
end
A helper function to manage the EC2 tags used to indicate cluster node status.
# chef/cookbooks/splunk/libraries/helper.rb
require 'aws-sdk-ec2' # assumption: AWS SDK for Ruby v3 gem; with SDK v2 use require 'aws-sdk'

# Update or insert the EC2 tag on an instance.
# Requires the Ohai EC2 hint file so that node['ec2'] is populated: touch /etc/chef/ohai/hints/ec2.json
# tag key: shc-member
# tag values:
# - unregistered: instance has been initialised but not added to the cluster, i.e. captain not bootstrapped
# - registered: instance has been added to the cluster and captain bootstrapped
#
def upsert_ec2_tag(instance_id, tag_value)
  # Derive the region from the availability zone, e.g. "us-east-1a" -> "us-east-1"
  ec2 = Aws::EC2::Resource.new(region: "#{node['ec2']['placement_availability_zone'].chop}")
  ec2.create_tags({
    resources: ["#{instance_id}"],
    tags: [
      {
        key: "shc-member",
        value: "#{tag_value}",
      },
    ],
  })
end
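Because the tag values are plain strings, a misspelled value would silently create a member that no filter ever finds. As a minimal sketch, the request passed to create_tags could be built and validated by a small pure function first. The names SHC_TAG_KEY, SHC_TAG_VALUES and shc_tag_request are hypothetical and not part of the original cookbook, and the instance ID is made-up sample data.

```ruby
# Hypothetical helper, not part of the original cookbook.
# Builds the hash passed to create_tags and rejects unknown tag values.
SHC_TAG_KEY = 'shc-member'
SHC_TAG_VALUES = %w[unregistered registered].freeze

def shc_tag_request(instance_id, tag_value)
  unless SHC_TAG_VALUES.include?(tag_value)
    raise ArgumentError, "unknown #{SHC_TAG_KEY} value: #{tag_value.inspect}"
  end

  {
    resources: [instance_id],
    tags: [{ key: SHC_TAG_KEY, value: tag_value }]
  }
end

# Sample instance ID, for illustration only
p shc_tag_request('i-0123456789abcdef0', 'registered')
```

With this in place, upsert_ec2_tag could pass the returned hash straight to ec2.create_tags, and a typo such as 'registed' would fail the Chef run instead of tagging the instance incorrectly.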
A helper function to get the EC2 instance IDs and hostnames of search cluster members as an array. The EC2 tag name is 'shc-member'. This function retrieves instances based on the tag's value.
# chef/cookbooks/splunk/libraries/helper.rb
def get_members(tag_value)
  ec2 = Aws::EC2::Resource.new(region: "#{node['ec2']['placement_availability_zone'].chop}")
  members = []
  ec2.instances(
    {
      filters: [
        {
          name: 'tag:shc-member',
          values: ["#{tag_value}"]
        },
        {
          name: 'instance-state-name',
          values: ['running','stopped']
        },
      ]
    }
  ).each do |i|
    members << {
      "instance_id" => i.instance_id,
      "dns" => i.private_dns_name
    }
  end
  return members
end
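To make the shape of the returned array concrete, here is a small sketch of how it can be turned into the comma-separated list of management URIs that the recipe later passes as -servers_list. The helper name mgmt_uris is hypothetical, the instance IDs and DNS names are made-up sample data, and 8089 is used only because it is Splunk's default management port.

```ruby
# Hypothetical helper, not part of the original cookbook.
# Converts get_members-style hashes into a comma-separated list of management URIs.
def mgmt_uris(members, port)
  members.map { |m| "https://#{m['dns']}:#{port}" }.join(',')
end

# Sample data in the same shape that get_members returns (values are invented)
members = [
  { 'instance_id' => 'i-0aaa1111', 'dns' => 'ip-10-0-0-1.ec2.internal' },
  { 'instance_id' => 'i-0bbb2222', 'dns' => 'ip-10-0-0-2.ec2.internal' }
]

puts mgmt_uris(members, 8089)
```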
The search head recipe installs Splunk and starts the Splunk daemon. It then performs the following logic to attempt to bootstrap the captain:
# chef/cookbooks/splunk/recipes/search-head.rb
# Path to the Splunk CLI, same as in the helper library
splunk_cmd = "#{node['splunk']['server_home']}/bin/splunk"

# Initialise this node as a search cluster member
if node['splunk']['search_cluster_deployment'] && node['splunk']['search_cluster_member']
  execute "Initialising search head cluster member" do
    command "#{splunk_cmd} init shcluster-config -auth #{node['splunk']['auth']} -mgmt_uri https://#{node['fqdn']}:#{node['splunk']['mgmt_server_port']} -replication_port #{node['splunk']['search_head_cluster_replication_port']} -replication_factor #{node['splunk']['search_factor']} -conf_deploy_fetch_url #{node['splunk']['deployer_url']}:#{node['splunk']['deployer_port']} -secret #{node['splunk']['pass4SymmKey']} -shcluster_label #{node['splunk']['shcluster_label']}"
    not_if "#{splunk_cmd} list shcluster-member-info -auth #{node['splunk']['auth']} | grep 'is_registered:1'"
    notifies :restart, "service[splunk]", :immediate
  end
end

# tag this node as unregistered
upsert_ec2_tag(node['ec2']['instance_id'], 'unregistered')

# collect the unregistered members
unregistered_members = get_members("unregistered")
Chef::Log.info("Found #{unregistered_members.count} unregistered members")

captain_elected = is_captain_elected
if captain_elected
  # captain already elected, so add this node to the existing cluster
  Chef::Log.info("SHC captain elected, adding this node via member")
  # discover registered members
  registered_members = get_members("registered")
  # add member to cluster using the mgmt URI of one of the registered members
  execute "Adding member to search head cluster" do
    command "#{splunk_cmd} add shcluster-member -auth #{node['splunk']['auth']} -current_member_uri https://#{registered_members[0]['dns']}:#{node['splunk']['mgmt_server_port']}"
  end
  # tag this node as registered member
  upsert_ec2_tag(node['ec2']['instance_id'], 'registered')
elsif !captain_elected && unregistered_members.count < 3
  Chef::Log.info("SHC captain not elected and fewer than 3 members available to vote. Nothing to do")
elsif !captain_elected && unregistered_members.count >= 3
  Chef::Log.info("SHC captain not elected but #{unregistered_members.count} members ready to vote, bootstrapping captain")
  # build the comma-separated list of management URIs for all unregistered members
  servers_list = []
  unregistered_members.each do |i|
    servers_list << "https://#{i['dns']}:#{node['splunk']['mgmt_server_port']}"
  end
  servers_list = servers_list.join(',')
  execute "Bootstrapping the search head cluster captain on #{node['fqdn']}" do
    command "#{splunk_cmd} bootstrap shcluster-captain -servers_list #{servers_list} -auth #{node['splunk']['auth']}"
  end
  # update ec2 tags
  unregistered_members.each do |i|
    upsert_ec2_tag(i["instance_id"], "registered")
  end
else
  Chef::Log.info("SHC is in an unknown state")
end
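The branching in the recipe boils down to three outcomes: join an existing cluster, wait for more members, or bootstrap the captain. As a rough sketch, that decision can be pulled out into a pure function that is easy to test without Splunk or AWS. The names shc_action and MIN_VOTERS are hypothetical and not part of the original cookbook; the threshold of 3 is the one used in the recipe.

```ruby
# Hypothetical helper, not part of the original cookbook.
# Maps the observed cluster state to the action the recipe takes.
MIN_VOTERS = 3 # same threshold as in the recipe

def shc_action(captain_elected, unregistered_count)
  # A captain exists: this node joins through an already registered member
  return :add_member if captain_elected
  # No captain yet, but enough initialised members to hold an election
  return :bootstrap if unregistered_count >= MIN_VOTERS

  # No captain and too few members: do nothing on this run
  :wait
end

puts shc_action(true, 1)
puts shc_action(false, 2)
puts shc_action(false, 3)
```

Written this way, the "unknown state" branch disappears, since the three cases cover every combination of the two inputs.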