Memcache issues on production #869

Closed
orta opened this issue Dec 19, 2017 · 8 comments · Fixed by #928

Comments

@orta
Contributor

orta commented Dec 19, 2017

I took a look at the production logs; at the moment it's mostly memcache errors:

2017-12-19T21:03:45.801920+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802046+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802048+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802167+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802169+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802169+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802253+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802254+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802269+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802370+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802372+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802488+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802490+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802490+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.802567+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:45.832608+00:00 app[web.2]: POST / 200 18.407ms 12.53.56.66 "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
2017-12-19T21:03:45.838560+00:00 app[web.2]: POST / 200 1145.257ms 54.162.14.161 "node-superagent/2.3.0"
2017-12-19T21:03:45.861809+00:00 app[web.2]: POST / 200 950.535ms 54.162.14.161 "node-superagent/2.3.0"
2017-12-19T21:03:45.881628+00:00 app[web.2]: POST / 200 364.268ms 54.242.23.131 "node-superagent/2.3.0"
2017-12-19T21:03:45.907738+00:00 app[web.2]: POST / 200 233.416ms 54.242.23.131 "node-superagent/2.3.0"
2017-12-19T21:03:46.006729+00:00 app[web.2]: POST / 200 875.840ms 54.242.23.131 "node-superagent/2.3.0"
2017-12-19T21:03:45.840088+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=33977d1f-653f-4ab6-91f9-553713ff8ed1 fwd="54.162.14.161" dyno=web.2 connect=0ms service=1148ms status=200 bytes=13956 protocol=https
2017-12-19T21:03:45.887548+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=f753b98e-6f17-4fea-bafd-d2fd58f5011a fwd="177.237.60.162" dyno=web.1 connect=0ms service=164ms status=200 bytes=3507 protocol=https
2017-12-19T21:03:41.225799+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=3620a04b-53d5-4397-b63e-9a860e53418e fwd="54.234.225.84" dyno=web.1 connect=0ms service=812ms status=200 bytes=22267 protocol=https
2017-12-19T21:03:45.824359+00:00 app[web.1]: POST / 200 8.111ms 12.53.56.66 "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
2017-12-19T21:03:45.837616+00:00 app[web.1]: POST / 200 11.081ms 54.234.225.84 "node-superagent/2.3.0"
2017-12-19T21:03:45.873292+00:00 app[web.1]: POST / 200 355.849ms 186.29.242.149 "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
2017-12-19T21:03:45.887170+00:00 app[web.1]: POST / 200 154.411ms 177.237.60.162 "Mozilla/5.0 Artsy-Mobile/3.2.7 Eigen/2017.11.01.10/3.2.7 (iPhone; iOS 10.1.1; Scale/3.00) AppleWebKit/601.1.46 (KHTML, like Gecko)"
2017-12-19T21:03:45.873840+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=226187a9-1a9c-4bee-8293-b7355fb48e43 fwd="186.29.242.149" dyno=web.1 connect=1ms service=380ms status=200 bytes=4447 protocol=https
2017-12-19T21:03:43.971547+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=8193121d-67fe-46c0-a124-9539903b81f9 fwd="54.234.225.84" dyno=web.1 connect=0ms service=125ms status=200 bytes=3454 protocol=https
2017-12-19T21:03:45.908929+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=aad50c91-c874-4ac0-b5f7-397608a05910 fwd="54.242.23.131" dyno=web.2 connect=0ms service=235ms status=200 bytes=3555 protocol=https
2017-12-19T21:03:44.608944+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=75d6aeb8-71bf-4675-b884-c9b6810183d1 fwd="54.242.23.131" dyno=web.1 connect=1ms service=221ms status=200 bytes=10489 protocol=https
2017-12-19T21:03:41.892859+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=64252b4b-c474-468a-9d0a-660031f14f4d fwd="54.242.23.131" dyno=web.2 connect=1ms service=1949ms status=200 bytes=9189 protocol=https
2017-12-19T21:03:42.754299+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=56e8f820-9b5b-4f06-8957-e574e470d7b4 fwd="54.162.14.161" dyno=web.1 connect=0ms service=259ms status=200 bytes=11395 protocol=https
2017-12-19T21:03:46.176739+00:00 heroku[router]: at=info method=OPTIONS path="/" host=metaphysics-production.artsy.net request_id=b76cd73f-0d1c-4ffe-ab6b-5f31506734c3 fwd="69.24.145.22" dyno=web.1 connect=0ms service=17ms status=204 bytes=314 protocol=https
2017-12-19T21:03:46.230584+00:00 heroku[router]: at=info method=OPTIONS path="/" host=metaphysics-production.artsy.net request_id=bc86a97a-bd3f-441c-a0cb-069e109b2dd7 fwd="69.24.145.22" dyno=web.2 connect=0ms service=9ms status=204 bytes=314 protocol=https
2017-12-19T21:03:46.105907+00:00 app[web.2]: POST / 200 1.946ms 60.198.40.209 "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
2017-12-19T21:03:46.227417+00:00 app[web.2]: POST / 200 2158.869ms 94.72.231.41 "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"
2017-12-19T21:03:46.228354+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.228737+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.228739+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.228740+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.228740+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.228741+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.228742+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.228742+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.295219+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=86969c17-8c03-4de4-83cf-9de5e05c4f26 fwd="69.24.145.22" dyno=web.1 connect=0ms service=21ms status=200 bytes=1965 protocol=https
2017-12-19T21:03:45.576355+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=b39c420f-610b-4e9b-8f0d-43932efef398 fwd="54.234.225.84" dyno=web.1 connect=0ms service=57ms status=200 bytes=3364 protocol=https
2017-12-19T21:03:46.223750+00:00 heroku[router]: at=info method=OPTIONS path="/" host=metaphysics-production.artsy.net request_id=b5c34f98-2ec5-458b-b088-45a26b38bab0 fwd="69.24.145.22" dyno=web.1 connect=0ms service=1ms status=204 bytes=314 protocol=https
2017-12-19T21:03:46.285068+00:00 heroku[router]: at=info method=OPTIONS path="/" host=metaphysics-production.artsy.net request_id=4a3191b3-a965-4271-b472-513de28d44b4 fwd="70.24.239.18" dyno=web.2 connect=0ms service=1ms status=204 bytes=314 protocol=https
2017-12-19T21:03:46.309686+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=86969c17-8c03-4de4-83cf-9de5e05c4f26 fwd="69.24.145.22" dyno=web.1 connect=1ms service=35ms status=200 bytes=6835 protocol=https
2017-12-19T21:03:46.204881+00:00 heroku[router]: at=info method=OPTIONS path="/" host=metaphysics-production.artsy.net request_id=66506f2d-c6e8-4694-b095-6cbffb159080 fwd="54.210.161.247" dyno=web.2 connect=0ms service=1ms status=204 bytes=331 protocol=https
2017-12-19T21:03:46.214457+00:00 app[web.1]: POST / 200 118.046ms 54.210.161.247 "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.1.1 Safari/538.1 Artsy/Reflection"
2017-12-19T21:03:46.217976+00:00 app[web.1]: POST / 200 1.411ms 80.69.135.153 "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
2017-12-19T21:03:46.281455+00:00 app[web.1]: POST / 200 296.586ms 54.242.23.131 "node-superagent/2.3.0"
2017-12-19T21:03:46.294998+00:00 app[web.1]: POST / 200 9.741ms 69.24.145.22 "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.109 Safari/537.36"
2017-12-19T21:03:46.309256+00:00 app[web.1]: POST / 200 23.347ms 69.24.145.22 "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.109 Safari/537.36"
2017-12-19T21:03:46.382376+00:00 app[web.1]: POST / 200 209.621ms 54.210.161.247 "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.1.1 Safari/538.1 Artsy/Reflection"
2017-12-19T21:03:46.290125+00:00 heroku[router]: at=info method=OPTIONS path="/" host=metaphysics-production.artsy.net request_id=f28afbc0-0630-4131-94b5-4066dae0f2e4 fwd="70.24.239.18" dyno=web.2 connect=0ms service=1ms status=204 bytes=314 protocol=https
2017-12-19T21:03:46.382867+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=375cae02-b8e8-4b1f-b58b-45a8a18ccd5b fwd="54.210.161.247" dyno=web.1 connect=0ms service=257ms status=200 bytes=627 protocol=https
2017-12-19T21:03:41.530308+00:00 heroku[router]: at=info method=OPTIONS path="/" host=metaphysics-production.artsy.net request_id=58b95045-5ffd-4fc2-af5f-d7d511d7852c fwd="12.53.56.66" dyno=web.1 connect=1ms service=3ms status=204 bytes=314 protocol=https
2017-12-19T21:03:41.708645+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=d1cc0467-8368-4df3-9b38-8459509853fb fwd="12.53.56.66" dyno=web.1 connect=1ms service=164ms status=200 bytes=2881 protocol=https
2017-12-19T21:03:44.921200+00:00 heroku[router]: at=info method=POST path="/" host=metaphysics-production.artsy.net request_id=9af96c2b-1eef-4047-b4c4-be18350bea5a fwd="184.179.104.240" dyno=web.1 connect=0ms service=3ms status=200 bytes=259 protocol=https
2017-12-19T21:03:46.372304+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372431+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372434+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372434+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372435+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372436+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372558+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372642+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372644+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372645+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372763+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372764+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372765+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.372924+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.373022+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.373028+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
2017-12-19T21:03:46.373240+00:00 app[web.2]: MemJS: Server <117374.59c187.us-east-5.heroku.prod.memcachier.com:11211> failed after (2) retries with error - read ECONNRESET
@alloy
Contributor

alloy commented Dec 19, 2017

Yeah, these have always been the same. @craigspaeth had said that we could probably easily switch to e.g. Redis.

@izakp
Contributor

izakp commented Jan 8, 2018

Saw this again this morning. Restarting the app fixed it, which suggests the problem is not on Memcachier's side but most likely in MemJS's error handling. I expect the appropriate fix would be to make sure MemJS reinstantiates (i.e. fully closes its socket and opens a new one) when it gets these errors, or even to just bail out and let the Heroku dyno process supervisor restart the app.
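
The "bail out" option could look roughly like the sketch below. It is only an illustration: `createFailureGuard`, its option names, and the threshold of 5 are hypothetical and not part of MemJS or Metaphysics. The idea is to count consecutive cache failures and, once a threshold is hit, hand control to a callback that by default exits the process so the supervisor can start a fresh one.

```javascript
// Hypothetical helper, not a MemJS API: counts consecutive cache failures
// and gives up once too many happen in a row.
function createFailureGuard({
  maxConsecutiveFailures = 5, // illustrative threshold, tune for real traffic
  onGiveUp = () => process.exit(1), // default: let the process supervisor restart us
} = {}) {
  let consecutiveFailures = 0

  return {
    // Call after every successful cache operation; resets the streak.
    recordSuccess() {
      consecutiveFailures = 0
    },
    // Call after every failed cache operation (e.g. "read ECONNRESET").
    recordFailure(error) {
      consecutiveFailures += 1
      if (consecutiveFailures >= maxConsecutiveFailures) {
        onGiveUp(error)
      }
    },
    // Exposed for logging and tests.
    get consecutiveFailures() {
      return consecutiveFailures
    },
  }
}

// Possible wiring, assuming a callback-style cache client:
//   client.get(key, (err, value) => {
//     if (err) guard.recordFailure(err)
//     else guard.recordSuccess()
//   })
```

Resetting on any success keeps a single transient reset from restarting the dyno; only a sustained run of failures, like the bursts in the logs above, would trigger it.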

@izakp
Contributor

izakp commented Jan 8, 2018

@alloy did update MemJS to 0.10.0 a year ago in response to this issue, but it's still open! :/ Is it possible to catch this error and call the memjs.Client.create() constructor fresh?
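
A minimal sketch of that "catch and recreate" idea, under stated assumptions: `createSelfHealingCache` is a hypothetical wrapper, the client factory is injected so the code does not depend on any particular driver, and the client is assumed to expose a promise-returning `get(key)` and optionally a `close()` method. With MemJS the factory might simply be `() => memjs.Client.create()`, but whether a fresh client actually clears the ECONNRESET state has not been verified here.

```javascript
// Hypothetical wrapper: swaps in a brand-new client whenever a read fails.
// `createClient` is any function returning an object with an async `get(key)`
// and, optionally, `close()`. These shapes are assumptions of this sketch.
function createSelfHealingCache(createClient) {
  let client = createClient()

  function replaceClient() {
    const stale = client
    client = createClient()
    // Best-effort cleanup: a client with a broken socket may throw on close.
    try {
      if (typeof stale.close === "function") stale.close()
    } catch (_) {
      // Ignore; the stale client is being discarded anyway.
    }
  }

  return {
    async get(key) {
      try {
        return await client.get(key)
      } catch (error) {
        replaceClient()
        // Retry once on the fresh client; a second failure propagates to the caller.
        return client.get(key)
      }
    },
  }
}
```

Note that under a burst of concurrent failures each failing call would create its own replacement client, so a real implementation would probably want to debounce or serialize the replacement.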

@alloy
Contributor

alloy commented Jan 8, 2018

After discussing with @izakp we concluded that we should just make the switch back to Redis, because:

  • iirc there are more offerings among Redis services and Node libraries
  • operationally speaking, Redis has @izakp’s favour

@orta
Contributor Author

orta commented Jan 9, 2018

@izakp
Contributor

izakp commented Jan 11, 2018

@orta I don't think we should conflate this with other memory issues

@izakp
Contributor

izakp commented Jan 11, 2018

@alloy so right now there are a couple of things lined up for us in Platform, but once we update to Kubernetes 1.9 and add additional instances to plan for Metaphysics' memory usage, I think we can move forward on this. You will then be able to use AWS ElastiCache for Memcached or Redis, accessed via a private network, and so can use the leading drivers.

@alloy
Contributor

alloy commented Jan 11, 2018

@izakp Awesome! 👌

3 participants