Ergonomie web, Ruby on Rails et Architecture de l'information
http://feeds.feedburner.com/FredericDeVillamilcom

Fred Thoughts on Startups, UX etc | The UX Ray






 



Dads are great at parenting too (and they love it)

Sat, 18 Feb 2017 08:11:23 +0000

I'm a happy father of 3, and even though I work a lot, I try to devote most of my free time to quality time with my kids. This includes playing with them, cooking, or helping them with their homework. I love it and they love it too. Unfortunately, I often face some kind of ordinary sexism when it comes to dads and parenting. Every time my wife's away for a few days, leaving me with the kids, her friends offer to invite the kids over "to cut me a little slack". I know I'm not the only one, and it's kind of them, but the truth is different.

Oh, you're leaving your husband all alone with the kids? My, what a terrible time ahead. They'll spend the whole weekend in front of the TV, eat ham and pasta at every meal, and they won't leave the house at all. He'll forget to change the diapers, and will leave your home in a terrible mess you'll spend weeks cleaning. Last time my husband was home alone with the kids, he forgot to pick them up at school, left them alone to do every possible foolish thing, burnt all the pans and phoned me to ask if he had to wait for the water to boil before adding the pasta. True story.

I'm not playing the "not all dads" game here. The fact that your husband might not be at the top of his game when it comes to parenting doesn't mean other men aren't. Dads love parenting, and they're great at it. Thinking that your husband, or your best friend's, is unable to take care of the kids and the house is the same as saying that women should be paid less than men because they're less capable, that their place is at home parenting and doing the housework, or that they got their job because of a short skirt and deep cleavage. In one word: sexism.

It's 2017 and the modern dad knows that:

- The kids deserve better than spending the whole weekend in front of the TV. It's actually Saturday 11:00 AM, the homework is done, and the whole family (minus mom) is ready to leave home to visit the Greek department of the Louvre museum / the Dark Age department of the Museum of the Army (because of the badass knights' armors) / take a walk in the forest / whatever they'd love to do.
- The youngest leave school at 4:20, the eldest at 4:30, they only go to after-school study on Monday and Tuesday, and they use half-priced train tickets, but they need their ID cards for this.
- Ham and pasta are old fashioned, and the only reason he cooks them some sausages and (homemade) mashed potatoes is that he loves them too. He also cooked a delicious veal stew and a wonderful boiled chicken, and will only take them to the restaurant because it's a way to stay longer at the museum.
- The bedroom needs to be tidy between the shower and dinner time.
- Kids get dirty easily, so they need to wash at least once a day and change their clothes every morning. Their daughter is way cuter in a freshly ironed dress, which he irons perfectly, by the way.
- Kids need to go to bed early during the week, and a bit later during the weekend, because we're all watching that wonderful Blu-ray of Disney's Alice in Wonderland so they can all sing the Unbirthday Song together, and run around in circles on Monday morning shouting "I'm late, I'm late, for a very important date / No time to say hello, goodbye! I'm late, I'm late, I'm late!"
- When a kid is sick at night, you change the sheets and put the dirty ones in the washing machine even though it's 4 AM and you're tired. A warm hug and back to sleep! And no milk for breakfast.
- Half a dozen 5-year-old little boys at a birthday party need to be taken care of all day. He has already imagined some cool outdoor games, and even has an indoor plan B just in case.

Indeed, when you come back home, with the kids ready to go to bed and a perfectly clean house, he'll look exhausted as if he just ran a marathon, but only to make you feel guilty for leaving him without anyone to take care of him. So next time your friends are worried because they're leaving the kids home alone with their dad, just reply "I don't know why you're worrying. Every ti[...]



15 Mac applications I can’t live without, 2017 edition

Sat, 18 Feb 2017 08:09:51 +0000

What are the first applications you install on a brand new Mac? That's a question I didn't have to ask myself until last month, when my significant other had to start from scratch with a brand new Mac because backups are for the weak. So I started to think about it: what would I do if I had to start from the ground up? After replacing Safari with Chrome, I'd install the first applications I can't imagine living without: Alfred, Textmate 2 and iTerm 2, in that order. Then, I'd be ready to set up everything I need for productivity, tech stuff, writing, and entertainment. Whether you are a new Mac user lost in the App Store or an experienced user looking for inspiration, this list was crafted for you with much love.

Productivity

Airmail

Let's be honest. All mail clients suck, and Apple's built-in Mail.app is no exception. Some of them, like the good old Mutt, suck less, and Airmail is one of them. Airmail is a fast, multi-account email client. It provides a handy replacement for Mac OS's default Mail.app when the latter starts crashing and getting sluggish because you've decided to keep a decade of email. Like any modern email client, Airmail natively supports Gmail, Yahoo!, iCloud, Microsoft Exchange, Outlook, or IMAP. And if you're still stuck in the 90's, no problem: you can use AOL or POP3. Airmail provides great community-supported plugin integration, including a quick-to-set-up PGP integration. The plus: Airmail supports SpamSieve natively, so you control your antispam instead of letting it control you. The cons: Airmail does not support text-only email. Price: 9,00 €, available on the Mac App Store.

Alfred

Alfred is a keyboard-based application launcher and multi-platform search wrapper. Since I met Vero at LeWeb 2009, I've been an avid Alfred user, having it replace all my Mac OS X built-in launchers. The little application launcher quickly became the center of my engineering productivity, which improved as Alfred grew up. Price: 15,00 €, with a free limited edition, not available on the Mac App Store.

F.lux

F.lux helps your brain wake up in the morning and go to sleep when it's time to shut down, by changing your screen's colour temperature according to the time of day. No magic here, only science. At dawn, the light colour is colder, and your brain is wired to understand it's time to wake up. At sunset, the light is much warmer, which is a sign it's time to go to bed. F.lux simply tricks your brain into knowing what time it is, since your screen usually provides a colder light. Price: free, not available on the Mac App Store.

NVAlt 2

Notational Velocity Alt 2 is a simple keyboard-controlled note taking app. It provides basic features like spell checking, URL recognition, basic search, online synchronisation and encryption. And that's all you need. I use Notational Velocity to store everything I need to keep at hand, from basic code snippets to meeting memos I need to keep for later. It's small, lightweight, less obtrusive than a post-it application, and won't leave countless text files on my Desktop as I used to do when writing everything in a text editor. Price: free, not available on the Mac App Store.

SimpleMind

SimpleMind is a lightweight mind mapping application for Mac OS. Mind mapping is a productivity technique where you connect your ideas to build a clear thought diagram. Each idea might have different developments, and mapping them might help you order every option of a complex process. I picked SimpleMind over every other mind mapping software because it's simple and available on iOS as well, which makes it perfect for people like me. Cons: poor keyboard navigation support. Price: 29,00 €, with a free limited edition, available on the Mac App Store.

SpamSieve

SpamSieve is a powerful anti-spam software for Mac. It plugs into most email clients on the market, analyses the email you get and learns from your habits. Despite looking expen[...]



An Advanced Elasticsearch Architecture for High-volume Reindexing

Sun, 30 Oct 2016 18:29:32 +0000

I've found a new and fun way to play with Elasticsearch to reindex a production cluster without disturbing our clients. If you haven't already, you might enjoy what we did last summer: reindexing 36 billion documents in 5 days within the same cluster.

Reindexing that cluster was easy because it was not in production yet. Reindexing a whole cluster where regular clients expect to get their data in real time brings new challenges and more problems to solve. As you can see on the screenshot below, our main bottleneck the first time we reindexed Blackhole, the aptly named, was the CPU. Having the whole cluster at 100% and a load of 20 is not an option, so we needed to find a workaround.

This time, we won't reindex Blackhole but Blink. Blink stores the data we display in our clients' dashboards. We need to reindex it every time we change the mapping to enrich that data and add the new features our clients and colleagues love.

A glimpse at our infrastructure

Blink is a group of 3 clusters, each built around 27 physical hosts with 64GB of RAM and a 4 core / 8 thread Xeon D-1520. They are small, affordable and disposable hosts. The topology is the same for each cluster:

- 3 master nodes (2 in our main data center and 1 in our backup data center, plus a virtual machine ready to launch in case of a major outage)
- 4 http query nodes (2 in each data center)
- 20 data nodes (10 in each data center)

The data nodes have 4*800GB SSD drives in RAID0, about 58TB per cluster. The data nodes are configured with Elasticsearch zone awareness. With 1 replica for each index, that makes sure we have 100% of the data in each data center, so we're crash proof. We didn't allocate the http query nodes to a specific zone for a reason: we want to use the whole cluster when possible, at the cost of 1.2ms of network latency. From the Elasticsearch documentation:

When executing search or GET requests, with shard awareness enabled, Elasticsearch will prefer using local shards — shards in the same awareness group — to execute the request. This is usually faster than crossing racks or awareness zones.

In front of the clusters, we have a layer 7 load balancer made of 2 servers, each running Haproxy and holding various virtual IP addresses (VIP). A keepalived ensures the active load balancer holds the VIP. Each load balancer runs in a different data center for fault tolerance. Haproxy uses the allbackups configuration directive, so we access the query nodes in the second data center only when the first two are down.

  frontend blink_01
    bind 10.10.10.1:9200
    default_backend be_blink01

  backend be_blink01
    balance leastconn
    option allbackups
    option httpchk GET /_cluster/health
    server esnode01 10.10.10.2:9200 check port 9200 inter 3s fall 3
    server esnode02 10.10.10.3:9200 check port 9200 inter 3s fall 3
    server esnode03 10.10.10.4:9200 check port 9200 inter 3s fall 3 backup
    server esnode04 10.10.10.5:9200 check port 9200 inter 3s fall 3 backup

So our infrastructure diagram becomes: (image)

In front of the Haproxy, we have an applicative layer called Baldur. Baldur was developed by my colleague Nicolas Bazire to handle multiple versions of the same Elasticsearch index and route queries amongst multiple clusters.

There's a reason why we had to split the infrastructure into multiple clusters even though they all run the same version of Elasticsearch, the same plugins, and do exactly the same things: each cluster supports about 10,000 indices and 30,000 shards. That's a lot, and the Elasticsearch master nodes have a hard time dealing with so many indexes and shards.

Baldur is both an API and an applicative load balancer built on Nginx with the LUA plugin. It connects to a MySQL database and uses Nginx's memory for caching. Baldur was built for 2 reasons:

- to tell our API the active index for a dashboard
- to tell our indexers which indexes they should write to, since we manage multiple versions of the same index.

In Elasticsearch, each index has a defined nam[...]
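A side note on the zone awareness mentioned above: it relies on a custom node attribute plus an allocation awareness setting. Here is a minimal elasticsearch.yml sketch for the Elasticsearch 1.x line (the syntax changed in later major versions); the attribute name datacenter and the zone values are illustrative, not our actual configuration:

  # On each node, declare which data center it lives in (illustrative attribute and values)
  node.datacenter: dc1

  # Tell the allocator to balance primaries and replicas across that attribute
  cluster.routing.allocation.awareness.attributes: datacenter

  # Forced awareness: never put both copies of a shard in the same data center,
  # even when the other data center is temporarily gone
  cluster.routing.allocation.awareness.force.datacenter.values: dc1,dc2

With forced awareness, losing a data center leaves the surviving replicas where they are instead of re-replicating everything into the remaining zone and filling its disks.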



That battle for Web standards we used to fight

Sat, 15 Oct 2016 14:30:55 +0000

Do you remember when fighting for Web standards was cool and the W3C HTML validator was a thing? I do, and it's great if you don't. It means you're younger than me, and that long, exhausting battle against a Web designed for Internet Explorer 6 is a thing of the past.

I became a Web standards advocate somewhere between 2002 and 2003. Back then, I was running Linux as a desktop and was furiously pissed off by Web sites that did not display properly under the Mozilla suite because 95% of the world was using Internet Explorer 6. We had to test our Web sites against Internet Explorer 5.0, 5.2 for Mac, 5.5, 6.0, various flavors of Netscape Navigator, Mozilla, Opera and Safari. And none of them rendered CSS the same way. Imagine a world where you need rounded GIFs because your browser can't display rounded corners, a world where padding in floating elements does not behave the same way on all browsers, and where CSS adoption still has a long way to go. Why would you use CSS when designing with tables and inline styles does the job? As a Web developer, all my management required of me was developing for Internet Explorer 6, and Internet Explorer 6 only. Every other browser was considered worthless, and with a monopolistic market share, Microsoft did not have to worry about releasing a more modern Web browser.

In 2002, the Web standards community was still small and resources were scarce. Eric Meyer on CSS was the Web developer's Bible, Molly Holzschlag our goddess and the CSS Zen Garden our lighthouse. My best memories as a Web standards advocate are meeting Molly at her hotel in Paris, eventually getting my copy of The Zen of CSS Design signed, and giving a talk with David Larlet at Paris Web 2008. We were a group of idealists, fighting daily for the Web we wanted to live in, teaching and evangelising our colleagues and families about dropping that outdated IE6 for Firefox.

14 years later, Internet Explorer 6 is dead and the battle for CSS adoption is a thing of the past, but there's still a lot to do. The fight has moved to accessibility. For many of us, accessibility was already a concern, but there's a long way to go. In 2016, many people still can't use the Web because of a disability or lack of access to a broadband connection.

The advent of unobtrusive Javascript and the rise of the frameworks were a great thing for accessibility. They allowed Web developers to write (more) accessible modern Web applications without even thinking or knowing about it. Prototype and Scriptaculous, the first widespread Javascript frameworks, were anything but accessible, but their wow effects drove people away from their homemade Javascript stacks. Their successors did better and better.

The second battle towards a more accessible Web lies at school. I remember teaching future Web developers about producing valid HTML and splitting structure from design. That's not the issue anymore; teaching good practices at school now is. I tremendously respect the people at Opquast for providing a handy checklist, great tools to validate a Web site's accessibility, and for their continuous evangelisation work.

The next step is making the people who pay understand that accessibility is not about targeting the 10% of people with disabilities, but about making sure everyone can use their Web site. It requires giving Web developers the time to do their job correctly, and an upfront investment in thinking that not everybody is willing to make.

Law enforcement was a good thing too. Forcing public organisations and large corporations to provide accessible Web sites was a great step, but only when it came with real penalties for those who don't comply.

Well, these were my (not so) nostalgic memories of the weekend. The memories of a small but awesome community, shared moments and a daily struggle that's hopefully not a thing anymore.[...]



Getting rid of the phantom indexes menace on Elasticsearch zombie masters

Tue, 11 Oct 2016 08:01:52 +0000

(image)

Split brain is a recurring problem when running any kind of cluster. A sudden server crash or network partition might lead to an inconsistent state and data corruption. Elasticsearch addresses this problem by allowing multiple nodes to be configured as master. Running an odd number of master nodes and properly setting discovery.zen.minimum_master_nodes to (number of master nodes / 2) + 1 is an easy way to prevent split brain disasters.
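For a cluster with 3 master-eligible nodes, that rule translates into a one-line elasticsearch.yml setting; a minimal sketch:

  # 3 master-eligible nodes: (3 / 2) + 1 = 2
  discovery.zen.minimum_master_nodes: 2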

However, there’s still a case your cluster might find itself in an inconsistent state.

When your master node leaves the cluster for some reason and won't reconnect by itself, it keeps a list of the indexes that existed before the split. Our clusters are living things, and we create and delete indexes all day long. When your long-lost master comes back from the dead, you'll notice some strange messages in the logs:

[2016-10-09 16:35:12,071][INFO ][gateway.local.state.meta ] [esmaster01] [183524] dangling index, exists on local file system, but not in cluster metadata, auto import to cluster state [YES]

These are the indexes your master used to know about before the split. Elasticsearch considers that these indexes still exist and will import them into the elected master's cluster state.

[2016-10-09 16:35:16,715][DEBUG][gateway.local.state.meta ] [esmaster01] [183524] no longer dangling (created), removing

That's the moment your cluster turns red and newly created indexes appear when running GET /_cat/indices, except their data doesn't exist anymore. The only way to bring the cluster back to green is to delete those phantom indexes one by one using DELETE. Nothing complicated, except that a large number of freshly created indexes might bring your elected master to its knees.
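Here is a sketch of that cleanup, in the same spirit as my Elasticsearch cheat sheet: it assumes the phantom indexes are exactly the ones reported red by _cat/indices and that esmaster01 is one of your hosts, so check both assumptions before running anything destructive:

  # Delete every red index, one by one, giving the busy master extra time to acknowledge
  for index in $(curl -s -XGET esmaster01:9200/_cat/indices | awk '/^red/ {print $3}'); do
    curl -XDELETE "esmaster01:9200/${index}?master_timeout=120s"
  done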

According to Elasticsearch documentation, this feature has 2 purposes:

- If a new master node is started which is unaware of the other indices in the cluster, adding the old nodes will cause the old indices to be imported, instead of being deleted.
- An old index can be added to an existing cluster by copying it to the data/ directory of a new node, starting the node and letting it join the cluster. Once the index has been replicated to other nodes in the cluster, the new node can be shut down and removed.

Elasticsearch's behaviour can be controlled using gateway.local.auto_import_dangled, which is set to yes by default.
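If you would rather never import dangling indexes at all, the setting can be switched off in elasticsearch.yml; a sketch for the 1.x line (check the documentation of your own version before relying on it):

  # Do not auto-import dangling indexes found on disk into the cluster state
  gateway.local.auto_import_dangled: no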

However, to avoid any surprises after a master node crash, I prefer to shut down Elasticsearch, delete the whole data directory and start the node as a fresh one. It might not cover every case, but it avoids most conflicts caused by a zombie node coming back from the dead.

Photo: Brain, by Adeel Anwer, CC.




From France 2002 to USA 2016

Sun, 09 Oct 2016 07:19:04 +0000

I don't write about politics often. I stopped being interested in local politics after I dropped out of my political sciences school back in 2001, with 2 exceptions. The first one was Barack Obama's first election, because a black man being elected president of a country with a long history of institutionalised racism was a thing. The second is this year's campaign, because Donald Trump being the Republican nominee rings a very unpleasant bell of déjà vu.

If you're into French politics, you might remember our 2002 presidential election. If you're not, let me tell you a story.

The French presidential election system differs from the American one in many ways. To become a candidate, you have to be endorsed by at least 500 elected officials, from small village mayors to senators. It's a direct election, so you vote for your favorite candidate. If they get more than 50% of the votes, they win; otherwise you vote a second time to choose between the 2 winners of the first round.

In 2002, everyone was expecting a second round between the moderate-right President Chirac and his socialist Prime Minister Lionel Jospin. Because in France, you can have a Prime Minister who's a political opponent of the President. I know this sounds stupid, but it has happened a couple of times. During the previous 5 years, President Chirac had been a terrible president, but he was an outstanding, born-to-win candidate. Jospin had been a despicable Prime Minister and a terrible candidate who did little if any campaigning, sure that Chirac's unpopularity would lead him to win without an effort.

On Sunday, April the 21st, 2002, I was working in a lab at my computer engineering school when I heard people shouting in the corridor. There was a TV set there, and I could see the devastated face of Prime Minister Lionel Jospin learning he hadn't made it to the second round. I couldn't refrain from smiling before I understood it meant President Chirac's opponent would be Jean-Marie Le Pen.

Jean-Marie Le Pen is a right-wing, racist, revisionist, populist French politician. He's well known for saying, among other things, that the gas chambers were "a detail in WWII history". And he was one step from being the next French President.

2 weeks later, Chirac was reelected with 82.2% of the votes. But it was too late. Chirac didn't win because he was the best candidate, but to ensure Le Pen would lose. It was not about Chirac's victory, but about Le Pen's defeat. And by reaching the second round for the first time in 36 years, Le Pen had already won.

From a French point of view, there are lots of similarities between our presidential election and the Clinton vs Trump campaign. I've been reading a lot that Trump being elected President would be a shame for America. That's wrong. Just as Le Pen reaching the second round was a day of shame for France, the USA has already had its day of shame by nominating Trump as the Republican candidate.

In 2008, the American presidential election was about knowing whether or not America would elect its first black president, and it did. I remember how happy I was. I woke up at 5 AM, logged into Seesmic and recorded a video message to tell my American friends how happy I was for them. Whatever happened during Barack Obama's two mandates, Tuesday, November the 4th made history.

8 years later, America is about to elect its first woman President. November the 8th, 2016 should be another historic day, but that's not what history will remember. History won't remember this day as the day a woman became President, but as the day people had to vote for or against Donald Trump. From that point of view, the attention-craving, celebrity-whore candidate has already won.[...]



How we reindexed 36 billion documents in 5 days within the same Elasticsearch cluster

Thu, 06 Oct 2016 07:00:00 +0000

At Synthesio, we use Elasticsearch in various places to run complex queries that fetch up to 50 million rich documents out of tens of billions in the blink of an eye. Elasticsearch makes this fast and easily scalable, where running the same queries over multiple MySQL clusters would take minutes and crash a few servers on the way. Every day, we push Elasticsearch's boundaries further, and going deeper and deeper into its internals leads to even more love.

Last week, we decided to reindex a 136TB dataset with a brand new mapping. Updating an Elasticsearch mapping on a large index is easy until you need to change an existing field type or delete one. Such updates require a complete reindexing into a separate index created with the right mapping, so there was no easy way out for us.

The "Blackhole" cluster

We've called our biggest Elasticsearch cluster "Blackhole", because that's exactly what it is: a hot, ready-to-use datastore able to contain virtually any amount of data. The only difference with a real black hole is that we can get our data back at the speed of light.

When we designed Blackhole, we had to choose between 2 different models:

- A few huge machines with 4 * 12 core CPUs, 512GB of memory and 36 800GB SSD drives, each of them running multiple instances of Elasticsearch.
- A lot of smaller machines we could scale horizontally as the cluster grows.

We opted for the latter since it would make scaling much easier and didn't require spending too much money upfront.

Blackhole runs on 75 physical machines:

- 2 http nodes, one in each data center, behind a HAProxy to load balance the queries.
- 3 master nodes located in 3 different data centers.
- 70 data nodes split across 2 different data centers.

Each node has a quad core Xeon D-1521 CPU running at 2.40GHz and 64GB of memory. The data nodes have a RAID0 over 4*800GB SSD drives with XFS. The whole cluster runs a systemd-less Debian Jessie with a 3.14.32 vanilla kernel. The current version of the cluster has 218.75TB of storage and 4.68TB of memory, with 2.39TB allocated to the Elasticsearch heap. That's all for the numbers.

Elasticsearch configuration

Blackhole runs Elasticsearch 1.7.5 on Java 1.8. Indexes have 12 shards and 1 replica. We ensure each data center hosts 100% of our data using Elasticsearch's rack awareness feature. This setup allows us to crash a whole data center with neither data loss nor downtime, which we test every month. All the filtered queries are run with _cache=false. Elasticsearch caches the filtered queries result in memory, making the whole cluster explode at the first search. Running queries on 100GB shards, this is not something you want to see.

When running in production, our configuration is:

  routing:
    allocation:
      node_initial_primaries_recoveries: 20
      node_concurrent_recoveries: 20
      cluster_concurrent_rebalance: 20
      disk:
        threshold_enabled: true
        watermark:
          low: 60%
          high: 78%

  index:
    number_of_shards: 12
    number_of_replicas: 1
    merge:
      scheduler:
        max_thread_count: 8
        type: 'concurrent'
      policy:
        type: 'tiered'
        max_merged_segment: 100gb
        segments_per_tier: 4
        max_merge_at_once: 4
        max_merge_at_once_explicit: 4
    store:
      type: niofs
    query:
      bool:
        max_clause_count: 10000

  action:
    auto_create_index: false

  indices:
    recovery:
      max_bytes_per_sec: 2048mb
    fielddata:
      breaker:
        limit: 80%
      cache:
        size: 25%
        expire: 1m
    store:
      throttle:
        type: 'none'

  discovery:
    zen:
      minimum_master_nodes: 2
      ping:
        multicast:
          enabled: false
        unicast:
          hosts: ["master01","master02","master03"]

  threadpool:
    bulk:
      queue_size: 3000
      type: cached
    index:
      queue_si[...]
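About the _cache=false mentioned above: in the Elasticsearch 1.x query DSL, the flag can sit inside the filter itself. A rough sketch, where the host, index name and term filter are made up for the example:

  curl -XGET 'http://blackhole:9200/some_index/_search' -d '{
    "query": {
      "filtered": {
        "query": { "match_all": {} },
        "filter": {
          "term": { "lang": "en", "_cache": false }
        }
      }
    }
  }'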



Happy birthday Dr Frankenstein

Wed, 05 Oct 2016 07:00:00 +0000

(image)

200 years ago was written what would become one of the most important fantastic, and at some points philosophical, novels: Mary Shelley's Frankenstein. Despite its old-fashioned, Victorian-era style, Frankenstein is still worth reading and studying in the light of today's progress and madness in artificial intelligence (AI).

I read Frankenstein the same summer I discovered Asimov's Robots cycle, and I can't help but relate the 2-centuries-old novel to The Naked Sun. If you had to read only one of Asimov's Robots novels, pick up The Naked Sun. It revolves around a paradox in Asimov's 3 Laws of Robotics, where a robot ends up killing a human.

Did you ever notice how we often call Frankenstein's creature by the name of its creator? Doing so, we both identify the monster with the man who created it and push the creator aside into the shadows, so we forget that mankind can engender such a monstrosity. Or maybe the name Frankenstein itself simply sounds like a monster's name, while Albert Einstein reminds us of the old tongue-pulling genius we love to show on our t-shirts, and one of the fathers of the Bomb.

All comparisons aside, Frankenstein still has a few lessons to give. As deep learning and machine learning have us pushing the boundaries of artificial intelligence, creating virtual personal assistants, support chat bots and even surgeon replacement prototypes, there's still a chance we end up creating a monster. By creating his own soulless being out of Science, Dr Frankenstein wanted to become as powerful as his own creator. Developing AI, but also trying to push genetic research further, we're repeating the same process, and the risks are exactly the same. One doesn't become one's own creator without the danger of being destroyed by one's own creature. Before we reach that point, there are many questions about pride, ethics, and the existence of the soul we need to answer.




The deadly difference between hiding the symptoms and solving the problem

Tue, 04 Oct 2016 07:00:00 +0000

(image)

There's a common confusion between solving a problem and hiding its symptoms. The tech world is full of examples, both because it's an easy trap to fall into and because of the move fast culture.

Say your application is down for short periods several times a day because your database server can't cope with the load. The fastest workaround is throwing hardware at the problem. Adding new or faster servers, maybe faster disks, will stop your application from crashing. Problem solved.

At least from a management point of view.

What you did, though, was not solving the problem, only making the symptoms disappear. That's an important step because it gives you time to focus on the problem before it comes back. To actually solve it, you will probably have to rewrite complex queries obviously written by a bunch of drunken otters, add indexes to billion-record tables, write and deploy a caching layer, or change your whole technology stack.

Making the symptoms disappear only buys you the time to solve your problem.

That's not so bad actually, when your management agrees to hear it. A common question I've heard over the years is "why should I throw money at fixing something that works and won't bring in money?" That's another easy, deadly trap to fall into, until the problem rises again. And "I told you so" won't solve it.




An Elasticsearch cheat sheet

Mon, 03 Oct 2016 07:00:00 +0000

I'm using Elasticsearch a lot, which brings me to run the same commands again and again to manage my clusters. Even though they're now all automated in Ansible, I thought it would be interesting to share them here.

Mass index deletion with pattern

I often have to delete hundreds of indexes at once. Their names usually follow some pattern, which makes batch deletion easier.

  for index in $(curl -XGET esmaster:9200/_cat/indices | awk '/pattern/ {print $3}'); do
    curl -XDELETE "esmaster:9200/${index}?master_timeout=120s"
  done

Mass optimize, indexes with the most deleted docs first

Lucene, which powers Elasticsearch, has a specific behavior when it comes to deleting or updating documents. Instead of actually deleting or overwriting the data, it flags the document as deleted and writes a new one. The only way to get rid of a deleted document is to run an optimize on your indexes. This snippet sorts your existing indexes by the number of deleted documents before it runs the optimize.

  for indice in $(curl -XGET esmaster:9200/_cat/indices | sort -rk 7 | awk '{print $3}'); do
    curl -XPOST "http://esmaster:9200/${indice}/_optimize?max_num_segments=1"
  done

Restart a cluster using rack awareness

Using rack awareness allows you to split your replicated data evenly between hosts or data centers. It's convenient to restart half of your cluster at once instead of host by host.

  curl -XPUT 'host:9200/_cluster/settings' -d '{ "transient" : { "cluster.routing.allocation.enable": "none" } }'
  for host in $(curl -XGET esmaster:9200/_cat/nodeattrs?attr | awk '/rack_id/ {print $2}'); do
    ssh $host service elasticsearch restart
  done
  sleep 60
  curl -XPUT 'host:9200/_cluster/settings' -d '{ "transient" : { "cluster.routing.allocation.enable": "all" } }'

Optimize your cluster restart

There's a simple way to accelerate your cluster restart. Once you've brought your masters back, run this snippet. Most of the options are self explanatory:

  curl -XPUT 'http://escluster:9200/_cluster/settings' -d '{
    "transient" : {
      "cluster.routing.allocation.cluster_concurrent_rebalance": 20,
      "indices.recovery.concurrent_streams": 20,
      "cluster.routing.allocation.node_initial_primaries_recoveries": 20,
      "cluster.routing.allocation.node_concurrent_recoveries": 20,
      "indices.recovery.max_bytes_per_sec": "2048mb",
      "cluster.routing.allocation.disk.threshold_enabled": true,
      "cluster.routing.allocation.disk.watermark.low": "90%",
      "cluster.routing.allocation.disk.watermark.high": "98%",
      "cluster.routing.allocation.enable": "primary"
    }
  }'

Then, once your cluster is back to yellow, run that one:

  curl -XPUT 'http://escluster:9200/_cluster/settings' -d '{
    "transient" : {
      "cluster.routing.allocation.enable": "all"
    }
  }'

Get useful information about your cluster

Nodes information

This snippet gets the most useful information from your Elasticsearch nodes: hostname, role (master, data, nothing), free disk space, heap used, ram used, file descriptors used, and load.

  curl -XGET 'https://escluster/_cat/nodes?v&h=host,r,d,hc,rc,fdc,l'

  host          r d       hc     rc     fdc   l
  192.168.1.139 d 1tb     9.4gb  58.2gb 20752 0.20
  192.168.1.203 d 988.4gb 16.2gb 59.3gb 21004 0.12
  192.168.1.146 d 1tb     14.1gb 59.2gb 20952 0.18
  192.168.1.169 d 1tb     14.3gb 58.8gb 20796 0.10
  192.168.1.180 d 1tb     16.1gb 60.5gb 21140 0.17
  192.168.1.188 d 1tb     9.5gb  59.4gb 20928 0.19

Then, it's easy to sort the output to get interesting information. Sort by free disk space:

  curl -XGET 'https://escluster/_cat/nodes?h=host,r,d,hc,rc,fdc,l' | sort -hrk 3

Sort by heap occupancy:

  curl -XGET 'https://escluster/_cat/nodes?h=host,r,d,hc,rc,fdc,l' | sort -hrk 4

And so on.
Indices i[...]