heroku

latest news and articles from the heroku website.

today we’re happy to announce that the sydney, australia region is now generally available for use with heroku private spaces. sydney joins virginia, oregon, frankfurt, and tokyo as regions where private spaces can be created by any heroku enterprise user. developers can now deploy heroku apps closer to customers in the asia-pacific area to reduce latency and take advantage of the advanced network & trust controls of spaces to ensure sensitive data stays protected.

usage

to create a private space in sydney, select the spaces tab in the heroku dashboard in heroku enterprise, then click the “new space” button and choose “sydney, australia” from the space region dropdown.

after a private space in sydney is created, heroku apps can be created inside it as normal. heroku postgres, redis, and kafka are also available in sydney as are a variety of third-party add-ons.

better latency for asia-pacific

prior to this release, a heroku enterprise developer, from anywhere in the world, could create apps in spaces in virginia, oregon, tokyo, or frankfurt, and have them be available to any user in the world. the difference with this release is that apps (and heroku data services) can be created and hosted in the sydney region. this will bring far faster access for developers and users of heroku apps across the asia-pacific area. time-to-first-byte for a user in australia accessing an app deployed in a private space in the sydney region is approximately four times better than for that same user accessing an app deployed in a private space in tokyo (approx 0.1s vs 0.4s).

extending the vision of heroku private spaces

a private space, available as part of heroku enterprise, is a network isolated group of apps and data services with a dedicated runtime environment, provisioned to heroku in a geographic region you specify. with spaces you can build modern apps with the powerful heroku developer experience and get enterprise-grade secure network topologies. this enables your heroku applications to securely connect to on-premise systems on your corporate network and other cloud services, including salesforce.

with the ga of the sydney region, we now bring those isolation, security, and network benefits to heroku apps and data services in the asia-pacific region.

learn more

all heroku enterprise customers can immediately begin to create private spaces in sydney and deploy apps there. we’re excited by the possibilities private spaces opens up for developers in australia and asia-pacific more broadly - if you want more information, or are an existing heroku customer and have questions on using and configuring spaces, please contact us.


Information

we’re excited to announce that heroku autoscaling is now generally available for apps using web dynos.

we’ve always made it seamless and simple to scale apps on heroku - just move the slider. but we want to go further, and help you in the face of unexpected demand spikes or intermittent activity. part of our core mission is delivering a first-class operational experience that provides proactive notifications, guidance, and—where appropriate—automated responses to particular application events. today we take another big step forward in that mission with the introduction of autoscaling.

autoscaling makes it effortless to meet demand by horizontally scaling your web dynos based on what’s most important to your end users: responsiveness. to measure responsiveness, heroku autoscaling uses your app’s 95th percentile (p95) response time, an industry-standard metric for assessing user experience. the p95 response time is the number of milliseconds that only 5% of your app’s response times exceed. you can view your app’s p95 response time in the application metrics response time plot. using p95 response time as the trigger for autoscaling ensures that the vast majority of your users experience good performance, without overreacting to performance outliers.
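
to make the metric concrete, here is a minimal ruby sketch of how a p95 value can be computed from a sample of request durations; this is purely illustrative and is not heroku's implementation.

# illustrative only: compute the 95th percentile of a set of response times (ms)
def p95(response_times_ms)
  sorted = response_times_ms.sort
  index  = (0.95 * (sorted.length - 1)).ceil
  sorted[index]
end

samples = [90, 95, 105, 110, 120, 130, 180, 220, 400, 1500]
p95(samples)
# => 1500 -- with only ten samples the p95 is effectively the slowest request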

autoscaling is easy to set up and use, and it recommends a p95 threshold based on your app’s past 24 hours of response times. response-based autoscaling ensures that your web dyno formation is always sized for optimal efficiency, while capping your costs based on limits you set. autoscaling is currently included at no additional cost for apps using performance and private web dynos.

[image: autoscaling demo]

get started

from the heroku dashboard, navigate to the resources tab to enable autoscaling for your web dynos:

[image: enabling autoscaling from the resources tab]

from the web dyno formation dialog set the desired upper and lower limit for your dyno range. with heroku autoscaling you won’t be surprised by unexpected dyno fees. the cost estimator shows the maximum possible web dyno cost when using autoscaling, expressed either in dyno units for heroku enterprise organizations or in dollars.

next, enter the desired p95 response time in milliseconds. to make it easy to select a meaningful p95 setting, the median p95 latency for the past 24 hours is provided as guidance. if you enable email notifications, we’ll let you know when scaling demand reaches your maximum dyno setting, so you won’t miss a customer request.

[image: autoscaling configuration dialog]

monitoring autoscaling

you can monitor your autoscaling configuration and scaling events in the events table on application metrics, and view the corresponding impact on application health.

[image: autoscaling events table]

when to use autoscaling

autoscaling is useful when demand on web resources is variable. however, it is not meant to be a panacea for all application health issues that result in latency. for example, lengthy response times may be caused by a downstream resource, such as a slow database query. in this case, scaling web dynos without sufficient database resources or query optimization could exacerbate the problem.

in order to identify whether autoscaling is appropriate for your environment we recommend that you load test prior to implementing autoscaling in production, and use threshold alerting to monitor your p95 response times and error rates. if you plan to load test please refer to our load testing guidelines for support notification requirements. as with manual scaling, you may need to tune your downstream components in anticipation of higher request volumes. additional guidance on optimization is available in the scaling documentation.

how it works

heroku's autoscaling model employs little's law to determine the optimal number of web dynos needed to maintain your current request throughput while keeping web request latency within your specified p95 response time threshold. the deficit or excess of dynos is measured as ldiff, which takes into consideration the past hour of traffic. for example in the following simulation, at time point 80 minutes there is a spike in response time (latency) and a corresponding dip in ldiff, indicating that there is a deficit in the existing number of web dynos with respect to the current throughput and response time target. the platform will add an additional web dyno and reassess the ldiff. this process will be repeated until the p95 response time is within your specified limit or you have reached your specified upper dyno limit. a similar approach is used for scaling in.
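
as a rough illustration of the idea (a simplified sketch, not heroku's actual algorithm), little's law says the average number of requests in flight equals throughput multiplied by time in system, which lets you estimate a dyno count from a throughput and a latency target:

# hedged sketch: little's law (l = lambda * w) applied to dyno sizing.
# requests_per_dyno is an assumed per-dyno concurrency, not a heroku value.
def dynos_needed(throughput_rps, target_latency_seconds, requests_per_dyno)
  in_flight = throughput_rps * target_latency_seconds   # l = lambda * w
  (in_flight / requests_per_dyno.to_f).ceil
end

dynos_needed(200, 0.4, 10)
# => 8 dynos to keep roughly 200 requests/second under a 400 ms target
# ldiff would then be (current_dynos - dynos_needed); a negative value
# indicates a deficit and triggers scaling out.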

[image: autoscaling simulation]

find out more

autoscaling has been one of your top requested features when it comes to operational experience - thank you to everyone who gave us feedback during the beta and in our recent ops survey. for more details on autoscaling refer to the dyno scaling documentation. learn more about heroku's other operational features here.

if there’s an autoscaling enhancement or metrics-driven feature you would like to see, you can reach us at [email protected].


Information

as we begin 2017, we want to thank you for supporting heroku. your creativity and innovation continues to inspire us, and pushed us to deliver even more new products and features in 2016. we especially want to thank everyone who helped us by beta testing, sharing heroku with others, and providing feedback. here are the highlights of what became generally available in 2016.

advancing the developer experience

heroku pipelines

a new way to structure, manage and visualize continuous delivery.

heroku review apps

test code at a shareable url using disposable heroku apps that spin up with each github pull request.

free ssl for apps on paid dynos

get ssl encryption on custom domains for free on apps that use paid dynos.

the new heroku cli

take advantage of the cli’s faster performance and new usability features.

heroku teams

powerful collaboration, administration and centralized billing capabilities to build and run more effective development teams.

flexible dyno hours

run a free app 24/7, or many apps on an occasional basis, using a pool of account-based free dyno hours.

threshold alerting

let the platform keep your apps healthy: get proactive alerts based on app responsiveness and error rates.

session affinity

route requests from a given browser to the same dyno, so apps with ‘sticky sessions’ can take advantage of heroku’s flexible scaling.

build data-centric apps on heroku

apache kafka on heroku

build data-intensive apps with ease using the leading open source solution for managing event streams.

postgresql 9.6

speed up sequential scans for faster analytics applications, create indexes without blocking writes on tables in production apps, and more.

heroku external objects

read and write postgres data from salesforce so you can integrate application data in heroku with business processes inside salesforce.

heroku connect apis

build repeatable automation for configuring heroku connect environments, managing connections across salesforce orgs, and integrating with existing operational systems.

heroku enterprise: advanced trust controls & scale for large organizations

heroku private spaces

have your own private heroku as a service, with configurable network boundaries, global regions, and private data services for your most demanding enterprise apps.

sso for heroku

use saml 2.0 identity providers like salesforce identity, ping and okta for single sign-on to heroku enterprise.

add-on controls

standardize the add-ons your team uses by whitelisting them within your heroku enterprise organization.

onwards!

we look forward to continuing our innovation across developer experience, data services, collaboration, and enterprise controls to help you build more amazing applications. have a product or feature you'd like to see in 2017? send us your feedback.

p.s. get your heroku created ascii artwork here and here.


Information

the ruby maintainers continued their annual tradition by gifting us a new ruby version to celebrate the holiday: ruby 2.4 is now available and you can try it out on heroku.

ruby 2.4 brings some impressive new features and performance improvements to the table. here are a few of the big ones:

binding#irb

have you ever used p or puts to get the value of a variable in your code? if you’ve been writing ruby the odds are pretty good that you have. the alternative repl pry (http://pryrepl.org/) broke many of us of this habit, but installing a gem to get a repl during runtime isn’t always an option, or at least not a convenient one.

enter binding.irb, a new native runtime invocation for the irb repl that ships with ruby. now you can simply add binding.irb to your code to open an irb session and have a look around:

# ruby-2.4.0
class SuperConfusing
  def what_is_even_happening_right_now
    @x = @xy[:y] ** @x

    binding.irb # open a repl here to examine @x, @xy,
                # and possibly your life choices
  end
end

one integer to rule them all

ruby previously used 3 classes to handle integers: the abstract super class integer, the fixnum class for small integers and the bignum class for large integers. you can see this behavior yourself in ruby 2.3:

# ruby-2.3.3
irb> 1.class
# => Fixnum
irb> (2**100).class
# => Bignum
irb> Fixnum.superclass
# => Integer
irb> Bignum.superclass
# => Integer

ruby 2.4 unifies the fixnum and bignum classes into a single concrete class integer:

# ruby-2.4.0
irb> 1.class
# => Integer
irb> (2**100).class
# => Integer

why did we ever have two classes of integer?

to improve performance ruby stores small numbers in a single native machine word whenever possible, either 32 or 64 bits in length depending on your processor. a 64-bit processor has a 64-bit word length; the 64 in this case describes the size of the registers on the processor.

the registers allow the processor to handle simple arithmetic and logical comparisons, for numbers up to the word size, by itself, which is much faster than manipulating values stored in ram.

on my laptop it's more than twice as fast for me to add 1 to a fixnum a million times than it is to do the same with a bignum:

# ruby-2.3.3
require "benchmark"

fixnum = 2**40
bignum = 2**80

n = 1_000_000

Benchmark.bm do |x|
  x.report("adding #{fixnum.class}:") { n.times { fixnum + 1 } }
  x.report("adding #{bignum.class}:") { n.times { bignum + 1 } }
end

# =>
#                      user     system      total        real
# adding Fixnum:   0.190000   0.010000   0.200000 (  0.189790)
# adding Bignum:   0.460000   0.000000   0.460000 (  0.471123)

when a number is too big to fit in a native machine word ruby will store that number differently, automatically converting it to a bignum behind the scenes.

how big is too big?

well, that depends. it depends on the processor you’re using, as we’ve discussed, but it also depends on the operating system and the ruby implementation you’re using.

wait it depends on my operating system?

yes, different operating systems use different c data type models.

when processors first started shipping with 64-bit registers it became necessary to augment the existing data types in the c language, to accommodate larger register sizes and take advantage of performance increases.

unfortunately, the c language doesn't provide a mechanism for adding new fundamental data types. these augmentations had to be accomplished via alternative data models like lp64, ilp64 and llp64.

ll-what now?

lp64, ilp64 and llp64 are some of the data models used in the c language. this is not an exhaustive list of the available c data models, but these are the most common.

the first few characters in each of these acronyms describe the data types they affect. for example, the "l" and "p" in the lp64 data model stand for long and pointer, because lp64 uses 64-bits for those data types.

these are the sizes of the relevant data types for these common data models:

|       | int | long | long long | pointer |
|-------|-----|------|-----------|---------|
| lp64  | 32  | 64   | na        | 64      |
| ilp64 | 64  | 64   | na        | 64      |
| llp64 | 32  | 32   | 64        | 64      |

almost all unix and linux implementations use lp64, including os x. windows uses llp64, which includes a new long long type, just like long but longer.
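
if you're curious, you can get a rough look at these sizes from ruby itself (the results below assume a 64-bit lp64 system; yours may differ):

irb> 1.size
# => 8  -- bytes in the machine word backing a fixnum on 64-bit cruby
irb> [0].pack("l!").bytesize
# => 8  -- a native long is 8 bytes under lp64, 4 bytes under llp64 (windows)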

so the maximum size of a fixnum depends on your processor and your operating system, in part. it also depends on your ruby implementation.

fixnum size by ruby implementation

| fixnum range         | min    | max       |
|----------------------|--------|-----------|
| 32-bit cruby (ilp32) | -2**30 | 2**30 - 1 |
| 64-bit cruby (llp64) | -2**30 | 2**30 - 1 |
| 64-bit cruby (lp64)  | -2**62 | 2**62 - 1 |
| jruby                | -2**63 | 2**63 - 1 |

the range of fixnum can vary quite a bit between ruby implementations.

in jruby, for example, a fixnum is any number between -2**63 and 2**63 - 1. cruby will have fixnum values either between -2**30 and 2**30 - 1 or between -2**62 and 2**62 - 1, depending on the underlying c data model.

your numbers are wrong, you're not using all the bits

you're right, even though we have 64 bits available we're only using 62 of them in cruby and 63 in jruby. both of these implementations use two's complement integers, binary values that use one of the bits to store the sign of the number. so that accounts for one of our missing bits, how about that other one?

in addition to the sign bit, cruby uses one of the bits as a fixnum_flag, to tell the interpreter whether or not a given word holds a fixnum or a reference to a larger number. the sign bit and the flag bit are at opposite ends of the 64-bit word, and the 62 bits left in the middle are the space we have to store a number.
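
you can see that 62-bit boundary directly in ruby 2.3 (assuming a 64-bit lp64 build; the cutoff will differ on other platforms):

# ruby-2.3.3 on 64-bit lp64 cruby (an assumption -- see the table above)
irb> (2**62 - 1).class
# => Fixnum
irb> (2**62).class
# => Bignum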

in jruby we have 63 bits to store our fixnum, because jruby stores both fixnum and bignum as 64-bit signed values; they don't need a fixnum_flag.

why are they changing it now?

the ruby team feels that the difference between a fixnum and a bignum is ultimately an implementation detail, and not something that needs to be exposed as part of the language.

using the fixnum and bignum classes directly in your code can lead to inconsistent behavior, because the range of those values depends on so many things. they don't want to encourage you to depend on the ranges of these different integer types, because it makes your code less portable.

unification also significantly simplifies ruby for beginners. when you're teaching your friends ruby you no longer need to explain the finer points of 64-bit processor architecture.

rounding changes

in ruby float#round has always rounded floating point numbers up for decimal values greater than or equal to .5, and down for anything less, much as you learned to expect in your arithmetic classes.

# ruby-2.3.3
irb> (2.4).round
# => 2
irb> (2.5).round
# => 3

during the development of ruby 2.4 there was a proposal to change this default rounding behavior to instead round to the nearest even number, a strategy known as half to even rounding, or gaussian rounding (among many other names).

# ruby-2.4.0-preview3
irb> (2.4).round
# => 2
irb> (2.5).round
# => 2
irb> (3.5).round
# => 4

the half to even strategy would only have changed rounding behavior for tie-breaking; numbers that are exactly halfway (.5) would have been rounded down for even numbers, and up for odd numbers.

why would anyone do that?

the gaussian rounding strategy is commonly used in statistical analysis and financial transactions, because for large sample sets the rounded values alter the average magnitude less significantly.

as an example let's generate a large set of random values that all end in .5:

# ruby-2.3.3
irb> halves = Array.new(1000) { rand(1..1000) + 0.5 }
# => [578.5 ... 120.5]
# 1000 random numbers between 1.5 and 1000.5

now we'll calculate the average after forcing our sum to be a float, to ensure we don't end up doing integer division:

# ruby-2.3.3
irb> average = halves.inject(:+).to_f / halves.size
# => 510.675

the actual average of all of our numbers is 510.675, so the ideal rounding strategy should give us a rounded average as close to that number as possible.

let's see how close we get using the existing rounding strategy:

# ruby-2.3.3
irb> round_up_average = halves.map(&:round).inject(:+).to_f / halves.size
# => 511.175
irb> (average - round_up_average).abs
# => 0.5

we're off the average by 0.5 when we consistently round ties up, which makes intuitive sense. so let's see if we can get closer with gaussian rounding:

# ruby-2.3.3
irb> rounded_halves = halves.map { |n| n.to_i.even? ? n.floor : n.ceil }
# => [578 ... 120]
irb> gaussian_average = rounded_halves.inject(:+).to_f / halves.size
# => 510.664
irb> (average - gaussian_average).abs
# => 0.011000000000024102

it would appear we have a winner. rounding ties to the nearest even number brings us more than 97% closer to our actual average. for larger sample sets we can expect the average from gaussian rounding to be almost exactly the actual average.

this is why gaussian rounding is the recommended default rounding strategy in the ieee standard for floating-point arithmetic (ieee 754).

so ruby decided to change it because of ieee 754?

not exactly, it actually came to light because gaussian rounding is already the default strategy for the kernel#sprintf method, and an astute user filed a bug on ruby: "rounding modes inconsistency between round versus sprintf".

here we can clearly see the difference in behavior between kernel#sprintf and float#round:

# ruby 2.3.3
irb(main):001:0> sprintf('%1.0f', 12.5)
# => "12"
irb(main):002:0> (12.5).round
# => 13

the inconsistency in this behavior prompted the proposed change, which actually made it into one of the ruby 2.4 preview versions, ruby-2.4.0-preview3:

# ruby 2.4.0-preview3
irb(main):006:0> sprintf('%1.0f', 12.5)
# => "12"
irb(main):007:0> 12.5.round
# => 12

in ruby-2.4.0-preview3 rounding with either kernel#sprintf or float#round will give the same result.

ultimately matz decided this fix should not alter the default behavior of float#round when another user reported a bug in rails: "breaking change in how #round works".

the ruby team decided to compromise and add a new keyword argument to float#round to allow us to set alternative rounding strategies ourselves:

# ruby 2.4.0-rc1
irb(main):001:0> (2.5).round
# => 3
irb(main):008:0> (2.5).round(half: :down)
# => 2
irb(main):009:0> (2.5).round(half: :even)
# => 2

the keyword argument :half can take either :down or :even and the default behavior is still to round up, just as it was before.

why preview versions are not for production

interestingly, before the default rounding behavior was briefly changed for 2.4.0-preview3, there was an unusual kernel#sprintf bug in 2.4.0-preview2:

# ruby 2.4.0-preview2
irb> numbers = (1..20).map { |n| n + 0.5 }
# => [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5, 19.5, 20.5]
irb> numbers.map { |n| sprintf('%1.0f', n) }
# => ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "12", "14", "14", "16", "16", "18", "18", "20", "20"]

in this example kernel#sprintf appears to be rounding numbers less than 12 up as though it was using the float#round method's default behavior, which was still in place at this point.

the preview releases before and after 2.4.0-preview2, both 2.4.0-preview1 and 2.4.0-preview3, show the expected sprintf behavior, consistent with ruby-2.3.3:

# ruby 2.4.0-preview1
irb> numbers.map { |n| sprintf('%1.0f', n) }
# => ["2", "2", "4", "4", "6", "6", "8", "8", "10", "10", "12", "12", "14", "14", "16", "16", "18", "18", "20", "20"]

# ruby 2.4.0-preview3
irb> numbers.map { |n| sprintf('%1.0f', n) }
# => ["2", "2", "4", "4", "6", "6", "8", "8", "10", "10", "12", "12", "14", "14", "16", "16", "18", "18", "20", "20"]

i discovered this by accident while researching this article and started digging through the 2.4.0-preview2 changes to see if i could identify the cause. i found this commit from nobu:

commit 295f60b94d5ff6551fab7c55e18d1ffa6a4cf7e3
author: nobu <[email protected]>
date:   sun jul 10 05:27:27 2016 +0000

    util.c: round nearly middle value

    * util.c (ruby_dtoa): [experimental] adjust the case that the
      float value is close to the exact but unrepresentable middle
      value of two values in the given precision, as r55604.

    git-svn-id: svn+ssh:[email protected] b2dd03c8-39d4-4d8f-98ff-823fe69b080e

kernel#sprintf accuracy in ruby 2.4

this was an early effort by nobu to handle cases where floating point numbers rounded inconsistently with kernel#sprintf in ruby-2.3.3 (and before):

# ruby-2.3.3
irb> numbers = (0..9).map { |n| "5.0#{n}5".to_f }
# => [5.005, 5.015, 5.025, 5.035, 5.045, 5.055, 5.065, 5.075, 5.085, 5.095]
irb> numbers.map { |n| sprintf("%.2f", n) }
# => ["5.00", "5.01", "5.03", "5.04", "5.04", "5.05", "5.07", "5.08", "5.08", "5.09"]

in the example above notice that 5.035 and 5.045 both round to 5.04. no matter what strategy kernel#sprintf is using this is clearly unexpected. the cause turns out to be the unseen precision beyond our representations.
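
you can peek at that hidden precision by printing more digits than the literal shows; the exact digits depend on your platform's doubles, but on a typical machine the output looks roughly like this:

# ruby-2.3.3 -- illustrative output; exact digits depend on the platform
irb> sprintf("%.20f", 5.035)
# => "5.03500000000000014211"  (slightly above 5.035, so it rounds up to 5.04)
irb> sprintf("%.20f", 5.045)
# => "5.04499999999999992895"  (slightly below 5.045, so it also rounds to 5.04)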

not to worry though, the final version of nobu's fixes resolves this issue, and it will be available in ruby 2.4.

kernel#sprintf will now consistently apply half to even rounding:

# ruby-2.4.0-rc1
irb> numbers = (0..9).map { |n| "5.0#{n}5".to_f }
# => [5.005, 5.015, 5.025, 5.035, 5.045, 5.055, 5.065, 5.075, 5.085, 5.095]
irb> numbers.map { |n| sprintf("%.2f", n) }
# => ["5.00", "5.02", "5.02", "5.04", "5.04", "5.06", "5.06", "5.08", "5.08", "5.10"]

better hashes

ruby 2.4 introduces some significant changes to the hash table backing ruby's hash object. these changes were prompted by vladimir makarov when he submitted a patch to ruby's hash table earlier this year.

if you have a couple of hours to spare that issue thread is an entertaining read, but on the off-chance you're one of those busy developers i'll go through the major points here. first we need to cover some ruby hash basics.

if you're already an expert on ruby hash internals feel free to skip ahead and read about the specific hash changes in ruby 2.4.

how ruby implements hash

let's imagine for a moment that we have a severe case of "not invented here" syndrome, and we've decided to make our own hash implementation in ruby using arrays. i'm relatively certain we're about to do some groundbreaking computer science here so we'll call our new hash turbohash, as it's certain to be faster than the original:

# turbo_hash.rb
class TurboHash
  attr_reader :table

  def initialize
    @table = []
  end
end

we'll use the @table array to store our table entries. we gave ourselves a reader to access it so it's easy to peek inside our hash.

we're definitely going to need methods to set and retrieve elements from our revolutionary hash so let's get those in there:

# turbo_hash.rb
class TurboHash
  # ...

  def [](key)
    # remember our entries look like this:
    # [key, value]
    find(key).last
  end

  def find(key)
    # Enumerable#find here will return the first entry that makes
    # our block return true, otherwise it returns nil.
    @table.find do |entry|
      key == entry.first
    end
  end

  def []=(key, value)
    entry = find(key)

    if entry
      # if we already stored it just change the value
      entry[1] = value
    else
      # otherwise add a new entry
      @table << [key, value]
    end
  end
end

excellent, we can set and retrieve keys. it's time to set up some benchmarking and admire our creation:

require "benchmark"

legacy = Hash.new
turbo  = TurboHash.new

n = 10_000

def set_and_find(target)
  key = rand

  target[key] = rand
  target[key]
end

Benchmark.bm do |x|
  x.report("hash: ")      { n.times { set_and_find(legacy) } }
  x.report("turbohash: ") { n.times { set_and_find(turbo) } }
end

#                    user     system      total        real
# hash:          0.010000   0.000000   0.010000 (  0.009026)
# turbohash:    45.450000   0.070000  45.520000 ( 45.573937)

well that could have gone better, our implementation is about 5000 times slower than ruby's hash. this is obviously not the way hash is actually implemented.

in order to find an element in @table our implementation traverses the entire array on each iteration; towards the end we're checking nearly 10k entries one at a time.

so let's come up with something better. the iteration is killing us; if we can find a way to index instead of iterating we'll be way ahead.

if we knew our keys were always going to be integers we could just store the values at their indexes inside of @table and look them up by their indexes later.

the issue of course is that our keys can be anything, we're not building some cheap knock-off hash that can only take integers.

we need a way to turn our keys into numbers in a consistent way, so "some_key" will give us the same number every time, and we can regenerate that number to find it again later.

it turns out that object#hash is perfect for this purpose:

irb> "some_key".hash
# => 3031662902694417109
irb> "some_other_key".hash
# => -3752665667844152731
irb> "some_key".hash
# => 3031662902694417109

object#hash returns unique(ish) integers for any object in ruby, and you'll get the same number back every time you run it again with an object that's "equal" to the previous object.

for example, every time you create a string in ruby you'll get a unique object:

irb> a = "some_key"
# => "some_key"
irb> a.object_id
# => 70202008509060
irb> b = "some_key"
# => "some_key"
irb> b.object_id
# => 70202008471340

these are clearly distinct objects, but they will have the same object#hash return value because a == b:

irb> a.hash
# => 3031662902694417109
irb> b.hash
# => 3031662902694417109

these hash return values are huge and sometimes negative, so we're going to use the remainder after dividing by some small number as our index instead:

irb> a.hash % 11 # => 8 

we can use this new number as the index in @table where we store the entry. when we want to look up an item later we can simply repeat the operation to know exactly where to find it.

this raises another issue, however: our new indexes are much less unique than they were originally; they range between 0 and 10. if we store more than 11 items we are certain to have collisions, overwriting existing entries.

rather than storing the entries directly in the table we'll put them inside arrays called "bins". each bin will end up having multiple entries, but traversing the bins will still be faster than traversing the entire table.

armed with our new indexing system we can now make some improvements to our turbohash.

our @table will hold a collection of bins and we'll store our entries in the bin that corresponds to key.hash % 11:

# turbo_hash.rb
class TurboHash
  NUM_BINS = 11

  attr_reader :table

  def initialize
    # we know our indexes will always be between 0 and 10
    # so we need an array of 11 bins.
    @table = Array.new(NUM_BINS) { [] }
  end

  def [](key)
    find(key).last
  end

  def find(key)
    # now we're searching inside the bins instead of the whole table
    bin_for(key).find do |entry|
      key == entry.first
    end
  end

  def bin_for(key)
    # since hash will always return the same thing we know right where to look
    @table[index_of(key)]
  end

  def index_of(key)
    # a pseudorandom number between 0 and 10
    key.hash % NUM_BINS
  end

  def []=(key, value)
    entry = find(key)

    if entry
      entry[1] = value
    else
      # store new entries in the bins
      bin_for(key) << [key, value]
    end
  end
end

let's benchmark our new and improved implementation:

                  user     system      total        real
hash:         0.010000   0.000000   0.010000 (  0.012918)
turbohash:    3.800000   0.010000   3.810000 (  3.810126)

so that's pretty good i guess, using bins decreased the time for turbohash by more than 90%. those sneaky ruby maintainers are still crushing us though, let's see what else we can do.

it occurs to me that our benchmark is creating 10_000 entries but we only have 11 bins. each time we iterate through a bin we're actually going over a pretty large array now.

let's check out the sizes on those bins after the benchmark finishes:

bin:  relative size:          length:
----------------------------------------
0     +++++++++++++++++++     (904)
1     ++++++++++++++++++++    (928)
2     +++++++++++++++++++     (909)
3     ++++++++++++++++++++    (915)
4     +++++++++++++++++++     (881)
5     +++++++++++++++++++     (886)
6     +++++++++++++++++++     (876)
7     ++++++++++++++++++++    (918)
8     +++++++++++++++++++     (886)
9     ++++++++++++++++++++    (952)
10    ++++++++++++++++++++    (945)

that's a nice even distribution of entries but those bins are huge. how much faster is turbohash if we increase the number of bins to 19?

                  user     system      total        real
hash:         0.020000   0.000000   0.020000 (  0.021516)
turbohash:    2.870000   0.070000   2.940000 (  3.007853)

bin:  relative size:           length:
----------------------------------------
0     ++++++++++++++++++++++   (548)
1     +++++++++++++++++++++    (522)
2     ++++++++++++++++++++++   (547)
3     +++++++++++++++++++++    (534)
4     ++++++++++++++++++++     (501)
5     +++++++++++++++++++++    (528)
6     ++++++++++++++++++++     (497)
7     +++++++++++++++++++++    (543)
8     +++++++++++++++++++      (493)
9     ++++++++++++++++++++     (500)
10    +++++++++++++++++++++    (526)
11    ++++++++++++++++++++++   (545)
12    +++++++++++++++++++++    (529)
13    ++++++++++++++++++++     (514)
14    ++++++++++++++++++++++   (545)
15    ++++++++++++++++++++++   (548)
16    +++++++++++++++++++++    (543)
17    ++++++++++++++++++++     (495)
18    +++++++++++++++++++++    (542)

we gained another 25%! that's pretty good, i bet it gets even better if we keep making the bins smaller. this is a process called rehashing, and it's a pretty important part of a good hashing strategy.

let's cheat and peek inside st.c to see how ruby handles increasing the table size to accommodate more bins:

/* https://github.com/ruby/ruby/blob/ruby_2_3/st.c#L38 */

#define ST_DEFAULT_MAX_DENSITY 5
#define ST_DEFAULT_INIT_TABLE_SIZE 16

ruby's hash table starts with 16 bins. how do they get away with 16 bins? weren't we using prime numbers to reduce collisions?

we were, but using prime numbers for hash table size is really just a defense against bad hashing functions. ruby has a much better hashing function today than it once did, so the ruby maintainers stopped using prime numbers in ruby 2.2.0.

what's this other default max density number?

the st_default_max_density defines the average maximum number of entries ruby will allow in each bin before rehashing: choosing the next largest power of two and recreating the hash table with the new, larger size.

you can see the conditional that checks for this in the add_direct function from st.c:

/* https://github.com/ruby/ruby/blob/ruby_2_3/st.c#L463 */

if (table->num_entries > ST_DEFAULT_MAX_DENSITY * table->num_bins) {...}

ruby's hash table tracks the number of entries as they're added using the num_entries value on table. this way ruby doesn't need to count the entries to decide if it's time to rehash, it just checks to see if the number of entries is more than 5 times the number of bins.

let's implement some of the improvements we stole from ruby to see if we can speed up turbohash:

class TurboHash
  STARTING_BINS = 16

  attr_accessor :table

  def initialize
    @max_density = 5
    @entry_count = 0
    @bin_count   = STARTING_BINS
    @table       = Array.new(@bin_count) { [] }
  end

  def grow
    # use bit shifting to get the next power of two and reset the table size
    @bin_count = @bin_count << 1

    # create a new table with a much larger number of bins
    new_table = Array.new(@bin_count) { [] }

    # copy each of the existing entries into the new table at their new location,
    # as returned by index_of(key)
    @table.flatten(1).each do |entry|
      new_table[index_of(entry.first)] << entry
    end

    # finally we overwrite the existing table with our new, larger table
    @table = new_table
  end

  def full?
    # our bins are full when the number of entries surpasses 5 times the number of bins
    @entry_count > @max_density * @bin_count
  end

  def [](key)
    find(key).last
  end

  def find(key)
    bin_for(key).find do |entry|
      key == entry.first
    end
  end

  def bin_for(key)
    @table[index_of(key)]
  end

  def index_of(key)
    # use @bin_count because it now changes each time we resize the table
    key.hash % @bin_count
  end

  def []=(key, value)
    entry = find(key)

    if entry
      entry[1] = value
    else
      # grow the table whenever we run out of space
      grow if full?

      bin_for(key) << [key, value]
      @entry_count += 1
    end
  end
end

so what's the verdict?

                  user     system      total        real
hash:         0.010000   0.000000   0.010000 (  0.012012)
turbohash:    0.130000   0.010000   0.140000 (  0.133795)

we lose. even though our turbohash is now 95% faster than our last version, ruby still beats us by an order of magnitude.

all things considered, i think turbohash fared pretty well. i'm sure there are some ways we could further improve this implementation but it's time to move on.

at long last we have enough background to explain what exactly is about to nearly double the speed of ruby hashes.

what actually changed

speed! ruby 2.4 hashes are significantly faster. the changes introduced by vladimir makarov were designed to take advantage of modern processor caching improvements by focusing on data locality.

this implementation speeds up the ruby hash table benchmarks on average by more than 40% on an intel haswell cpu.

https://github.com/ruby/ruby/blob/trunk/st.c#L93

oh good! what?

processors like the intel haswell series use several levels of caching to speed up operations that reference the same region of memory.

when the processor reads a value from memory it doesn't just take the value it needs; it grabs a large piece of memory nearby, operating on the assumption that it is likely going to be asked for some of that data in the near future.

the exact algorithms processors use to determine which bits of memory should get loaded into each cache are somewhat difficult to discover. manufacturers consider these strategies to be trade secrets.

what is clear is that accessing any of the levels of caching is significantly faster than going all the way out to pokey old ram to get information.

how much faster?

real numbers here are almost meaningless to discuss because they depend on so many factors within a given system, but generally speaking we can say that l1 cache hits (the fastest level of caching) could speed up memory access by two orders of magnitude or more.

an l1 cache hit can complete in half a nanosecond. for reference consider that a photon can only travel half a foot in that amount of time. fetching from main memory will generally take at least 100 nanoseconds.

got it, fast... therefore data locality?

exactly. if we can ensure that the data ruby accesses frequently is stored close together in main memory, we significantly increase our chances of winning a coveted spot in one of the caching levels.

one of the ways to accomplish this is to decrease the overall size of the entries themselves. the smaller the entries are, the more likely they are to end up in the same caching level.

in our turbohash implementation above our entries were stored as simple arrays, but in ruby-2.3.3 table entries were actually stored in a linked list. each of the entries contained a next pointer that pointed to the next entry in the list. if we can find a way to get by without that pointer and make the entries smaller we will take better advantage of the processor's built-in caching.

the new approach in ruby-2.4.0-rc1 actually goes even further than just removing the next pointer: it removes the entries themselves. instead we store the entries in a separate array, the "entries array", and we record the indexes for those entries in the bins array, referenced by their keys.

this approach is known as "open addressing".

open addressing

ruby has historically used "closed addressing" in its hash table, also known as "open hashing". the new alternative approach proposed by vladimir makarov uses "open addressing", also known as "closed hashing". i get that naming things is hard, but this can really get pretty confusing. for the rest of this discussion, i will only use open addressing to refer to the new implementation, and closed addressing to refer to the former.

the reason open addressing is considered open is that it frees us from the hash table. the table entries themselves are not stored directly in the bins anymore, as with a closed addressing hash table, but rather in a separate entries array, ordered by insertion.

open addressing uses the bins array to map keys to their index in the entries array.

let's set a value in an example hash that uses open addressing:

# ruby-2.4.0-rc1
irb> my_hash["some_key"] = "some_value"

when we set "some_key" in an open addressing hash table ruby will use the hash of the key to determine where our new key-index reference should live in the bins array:

irb> "some_key".hash
# => -3336246172874487271

ruby first appends the new entry to the entries array, noting the index where it was stored. ruby then uses the hash above to determine where in the bins array to store the key, referencing that index.

remember that the entry itself is not stored in the bins array, the key only references the index of the entry in the entries array.
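
to make the layout concrete, here is a toy ruby sketch of the two-array idea (not the real st.c code, and collision handling is omitted): the bins array holds only small indexes, while the insertion-ordered entries array holds the actual key/value pairs.

# toy illustration of open addressing -- not ruby's actual implementation
class TinyOpenHash
  def initialize(bin_count = 16)
    @bins    = Array.new(bin_count)   # bin -> index into @entries (or nil)
    @entries = []                     # [[key, value], ...] in insertion order
  end

  def []=(key, value)
    @entries << [key, value]
    @bins[key.hash % @bins.size] = @entries.size - 1
  end

  def [](key)
    index = @bins[key.hash % @bins.size]
    index && @entries[index].last
  end
end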

determining the bin

the lower bits of the key's hash itself are used to determine where it goes in the bins array.

because we're not using all of the available information from the key's hash this process is "lossy", and it increases the chances of a later hash collision when we go to find a bin for our key.

however, the cost of potential collisions is offset by the fact that choosing a bin this way is significantly faster.

in the past, ruby has used prime numbers to determine the size of the bins array. this approach gave some additional assurance that a hashing algorithm which didn't return evenly distributed hashes would not cause a single bin to become unbalanced in size.

the bin size was used to mod the computed hash, and because the bin size was prime, it decreased the risk of hash collisions as it was unlikely to be a common factor of both computed hashes.

since version 2.2.0 ruby has used bin array sizes that correspond to powers of two (16, 32, 64, 128, etc.). when we know the bin size is a power of two we can use the lower bits of the hash to calculate a bin index, so we find out where to store our entry reference much more quickly.
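
the trick is that taking a number modulo a power of two is the same as masking off its lower bits, a single bitwise and instead of a division:

bin_count = 16
h = "some_key".hash

h % bin_count == (h & (bin_count - 1))
# => true -- the bin index can be computed with one cheap bitwise operation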

what's wrong with prime modulo mapping?

dividing big numbers by primes is slow. dividing a 64-bit number (a hash) by a prime can take more than 100 cpu cycles for each iteration, which is even slower than accessing main memory.

even though the new approach may produce more hash collisions, it ultimately improves performance, because collisions are resolved by probing the available bins, which is fast.

linear probing

the open addressing strategy in ruby 2.4 uses a "full cycle linear congruential generator".

this is just a function that generates pseudorandom numbers based on a seed, much like ruby's random#rand method.

given the same seed, the random#rand method will generate the same sequence of numbers, even if we create a new instance:

irb> r = Random.new(7)
# => #<Random:0x007fee63030d50>
irb> r.rand(1..100)
# => 48
irb> r.rand(1..100)
# => 69
irb> r.rand(1..100)
# => 26

irb> r = Random.new(7)
# => #<Random:0x007fee630ca928>
irb> r.rand(1..100)
# => 48
irb> r.rand(1..100)
# => 69
irb> r.rand(1..100)
# => 26

# note that these values will be distinct for separate ruby processes.
# if you run this same code on your machine you can expect to get different numbers.

similarly a linear congruential generator will generate the same numbers in sequence if we give it the same starting values.

linear congruential generator (lcg)

this is the algorithm for a linear congruential generator:

x_(n+1) = (a * x_n + c) % m

for carefully chosen values of a, c, m and initial seed x_0, the values of the sequence x will be pseudorandom.

here are the rules for choosing these values:

  • m must be greater than 0 (m > 0)
  • a must be greater than 0 and less than m (0 < a < m)
  • c must be greater than or equal to 0 and less than m (0 <= c < m)
  • x_0 must be greater than or equal to 0 and less than m (0 <= x_0 < m)

implemented in ruby the lcg algorithm looks like this:

irb> a, x_n, c, m = [5, 7, 3, 16]
# => [5, 7, 3, 16]
irb> x_n = (a * x_n + c) % m
# => 6
irb> x_n = (a * x_n + c) % m
# => 1
irb> x_n = (a * x_n + c) % m
# => 8

for the values chosen above that sequence will always return 6, 1 and 8, in that order. because i've chosen the initial values with some additional constraints, the sequence will also choose every available number before it comes back around to 6.

an lcg that returns each number before returning any number twice is known as a "full cycle" lcg.

full cycle linear congruential generator

for a given seed we describe an lcg as full cycle when it will traverse every available state before returning to the seed state.

so if we have an lcg that is capable of generating 16 pseudorandom numbers, it's a full cycle lcg if it will generate a sequence including each of those numbers before duplicating any of them.

irb> (1..16).map { x_n = (a * x_n + c) % m }.sort
# => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

these are the additional rules we must use when choosing our starting values to make an lcg full cycle:

  • c can't be 0 (c != 0)
  • m and c are relatively prime (the only positive integer that divides both of them is 1)
  • (a - 1) is divisible by all prime factors of m
  • (a - 1) is divisible by 4 if m is divisible by 4

the first requirement makes our lcg into a "mixed congruential generator". any lcg with a non-zero value for c is described as a mixed congruential generator, because it mixes multiplication and addition.

if c is 0 we call the generator a "multiplicative" congruential generator (mcg), because it only uses multiplication. an mcg is also known as a lehmer random number generator (lrng).

the last 3 requirements in the list up above make a mixed congruential generator into a full cycle lcg. those 3 rules by themselves are called the hull-dobell theorem.

hull-dobell theorem

the hull-dobell theorem describes a mixed congruential generator with a full period (one that generates all values before repeating).

in ruby 2.4 vladimir has implemented an lcg that satisfies the hull-dobell theorem, so ruby will traverse the entire collection of bins without duplication.

remember that the new hash table implementation uses the lower bits of a key's hash to find a bin for our key-index reference, a reference that maps the entry's key to its index in the entries table.

if the first attempt to find a bin for a key results in a hash collision, future attempts will use a different means of calculating the hash.

the unused bits from the original hash are used with the collision bin index to generate a new secondary hash, which is then used to find the next bin.

when the first attempt results in a collision the bin searching function becomes a full cycle lcg, guaranteeing that we will eventually find a home for our reference in the bins array.
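
to illustrate the idea (a simplified sketch, not the actual st.c probing code), an lcg with a power-of-two modulus and constants satisfying the hull-dobell theorem visits every bin exactly once before repeating:

# simplified sketch of lcg-based probing over a power-of-two number of bins.
# a = 5, c = 1 satisfy the hull-dobell theorem for any m that is a power of two,
# so the sequence visits every bin index before it repeats.
def probe_sequence(start_index, bin_count)
  a, c = 5, 1
  index = start_index
  Array.new(bin_count) do
    current = index
    index = (a * index + c) % bin_count
    current
  end
end

probe_sequence(3, 16).sort
# => [0, 1, 2, ..., 15] -- every bin is visited exactly once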

since this open addressing approach allows us to store the much smaller references to entries in the bins array, rather than the entirety of the entries themselves, we significantly decrease the memory required to store the bins array.

the new smaller bins array then increases our chances of taking advantage of the processor caching levels, by keeping this frequently accessed data structure close together in memory. vladimir improved the data locality of the ruby hash table.

so ruby is faster and vladimir is smart?

yup! we now have significantly faster hashes in ruby thanks to vladimir and a whole host of other ruby contributors. please make sure you make a point of thanking the ruby maintainers the next time you see one of them at a conference.

contributing to open source can be a grueling and thankless job. most of the time contributors only hear from users when something is broken, and maintainers can sometimes forget that so many people are appreciating their hard work every day.

want to make a contribution yourself?

the best way to express your gratitude for ruby is to make a contribution.

there are all sorts of ways to get started contributing to ruby. if you're interested in contributing to ruby itself, check out the ruby core community page.

another great way to contribute is by testing preview versions as they’re released, and reporting potential bugs on the ruby issues tracker. watch the recent news page (rss feed) to find out when new preview versions are available.

if you don't have the time to contribute to ruby directly, consider making a donation to ruby development.

is that everything new in ruby 2.4?

not even close. there are many more interesting updates to be found in the ruby 2.4 changelog.

here are a few of my favorites that i didn't have time to cover:

thank you so much for reading, i hope you have a wonderful holiday.

":heart:" jonan


Information

today we are announcing the newest version of the heroku cli. we know how much time you spend in the cli as developers and how much pride you take in being able to get things done quickly. our new cli has big improvements in performance as well as enhanced readability for humans and machines.

tuned for performance

cli response time is made up of two parts: the api response time and the performance of the cli itself, and the latter is where we’ve made big improvements. while a typical unix user should experience responses that are around half a second faster, the biggest gains are for windows users, as the new cli no longer has a ruby wrapper.

when we measured the time it takes for the info command in the old vs. new cli, it decreased from 1690 to 1210 milliseconds on unix, and from 3409 to 944 milliseconds on windows! though individual results will vary, you should experience faster response times on average across all commands.

[image: cli performance comparison on windows]

installing the new cli

you might have noticed some improvements over the last few months, but to get the fastest version you’ll need to uninstall and reinstall the cli, because we’ve rewritten it in node.js with new installers. the good news is that this should be the last manual update you’ll ever do for the heroku cli: the new cli will auto-update in the future.

mac os x users can uninstall by typing the following:

$ rm -rf /usr/local/heroku
$ rm -rf ~/.heroku ~/.local/share/heroku ~/.config/heroku ~/.cache/heroku

then download and run the os x installer.

on windows, to uninstall the heroku cli:

  1. click start > control panel > programs > programs and features.
  2. select heroku cli, and then click uninstall.
  3. delete the .config/heroku directory inside your home directory.

then download and run the windows installer.

for the last few of you who are still using our very old ruby gem - now is a great time to upgrade to the full heroku cli experience. please let us know if you run into any issues with installation as we’re here to help!

improved readability for humans and machines

the new cli includes a number of user experience improvements that we’ve been rolling out over the past few months. here are some of our favorites.

grep-parseable output

we’ve learned that while you value human-readable output, you want grep-parseable output too. we’ve standardized the output format to make it possible to use grep.

for example, let’s look at heroku regions. heroku regions at one point showed output like the following:

[image: old heroku regions output]

while this shows all the information about the available regions, and is arguably more readable for humans as it groups the two regions underneath their respective headers, you lose the ability to use grep to filter the data. here is a better way to display this information:

[image: new heroku regions output]

now you can use grep to filter just common runtime spaces:

[image: filtering heroku regions output with grep]

power up with the jq tool

if you want even better tools to work with a richer cli output, many commands support a --json flag. use the powerful jq tool to query the results.

[image: querying heroku regions --json output with jq]

$ heroku

we noticed that heroku was one of the top commands users run. we learned that many users were running it to get a holistic overview of their account. we re-ordered the output so it makes sense at a glance, showing your starred apps first. we also added context that gives you a dashboard-style view of the current state of those apps and how they fit into the bigger picture, including pipeline info, last release info, metrics, and errors. at the end of the output, we give guidance on where you might want to go next - such as viewing add-ons or perhaps apps in a particular org.

[image: dashboard-style output of the heroku command]

colors

we’ve used color to help you quickly read command output. we’ve given some nouns in the cli standard colors, so that you’ll easily spot them. in the example above you’ll notice that apps are purple, example commands are in blue, and the number of unread notifications is green. we typically specify errors and warning messages in yellow and red.

we’ve tried to be mindful with color. too many contrasting colors in the same place can quickly begin to compete for attention and reduce readability. we also make sure color is never the only way we communicate information.

you can always disable color as a user, by adding --no-color or setting color=false.

input commands: flags and args

our new cli makes greater use of flags over args. flags provide greater clarity and readability, and give you confidence that you are running the command correctly.

an old heroku fork command would look like this:

$ heroku fork destapp -a sourceapp 

which app is being forked and which app is the destination app? it’s not clear.

the new heroku fork has required flags:

$ heroku fork --from sourceapp --to destapp 

the input flags specify the source and destination with --from and --to so that it’s very clear. you can specify these flags in any order, and still be sure that you will get the correct result.

looking to the future, flags will allow us to provide autocomplete in a much better fashion than args. this is because when the user types:

$ heroku info --app <tab><tab> 

...we know without question that the next thing to complete is an app name and not another flag or other type of argument.

learn more

these are just some examples of the work we’ve been doing to standardize and improve the heroku cli user experience. you can read more in the heroku dev center cli article. we’ve also published a cli style guide that documents our point of view on cli design and provides a clear direction for designing delightful cli plugins.

as always, we love getting feedback from you so try out the new cli and let us know your thoughts.


Information

postgres is our favorite database—it’s reliable, powerful and secure. here are a few essential tips learned from building, and helping our customers build, apps around postgres. these tips will help ensure you get the most out of postgres, whether you’re running it on your own box or using the heroku postgres add-on.

use a connection pooler

postgres connections are not free, as each established connection has a cost. by using a connection pooler, you’ll reduce the number of connections you use and reduce your overhead.

most postgres client libraries include a built-in connection pooler; make sure you’re using it.

you might also consider using our pgbouncer buildpack if your application requires a large number of connections. pgbouncer is a server-side connection pooler and connection manager that goes between your application and postgres. check out some of our documentation for using pgbouncer for ruby and java apps.
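
as one example (a sketch, assuming the pg and connection_pool gems; use whatever pooling your own client library provides), a small pool can be shared across threads like this:

require "pg"
require "connection_pool"

# a small, shared pool of postgres connections
DB_POOL = ConnectionPool.new(size: 5, timeout: 5) do
  PG.connect(ENV["DATABASE_URL"])
end

# check a connection out only for as long as the query runs
DB_POOL.with do |conn|
  conn.exec("select 1")
end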

set an application name

postgres allows you to see what clients are connected and what each of them is doing using the built-in pg_stat_activity table.

by explicitly marking each connection you open with the name of your dyno, using the dyno environment variable, you’ll be able to track what your application is doing at a glance:

set application_name to 'web.1'; 
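
from a ruby app you might do this right after connecting (a hedged sketch using the pg gem and the dyno environment variable):

conn = PG.connect(ENV["DATABASE_URL"])

# tag this connection with the dyno name so it shows up in pg_stat_activity
conn.exec_params("select set_config('application_name', $1, false)",
                 [ENV["DYNO"] || "console"])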

now you will be able to quickly see what each dyno is doing, using heroku pg:ps:

$ heroku pg:ps
 procpid |  source  |   running_for   | waiting |          query
---------+----------+-----------------+---------+---------------------------
   31776 | web.1    | 00:19:08.017088 | f       | <idle> in transaction
   31912 | worker.1 | 00:18:56.12178  | t       | select * from customers;
(2 rows)

you will also be able to see how many connections each dyno is using, and much more, by querying the pg_stat_activity table:

$ heroku pg:psql
select application_name, count(*)
from pg_stat_activity
group by application_name
order by 2 desc;

 application_name | count
------------------+-------
 web.1            |    15
 web.2            |    15
 worker.1         |     5
(3 rows)

set a statement_timeout for web dynos

long running queries can have an impact on your database performance because they may hold locks or over-consume resources. to avoid them, postgres allows you to set a timeout per connection that will abort any queries exceeding the specified value. this is especially useful for your web dynos, where you don’t want any requests to run longer than your request timeout.

set statement_timeout to '30s'; 

track the source of your queries

being able to determine which part of your code is executing a query makes optimization easier, and makes it easier to track down expensive queries or n+1 queries.

there are many ways to track which part of your code is executing a query, from a monitoring tool like new relic to simply adding a comment to your sql specifying what code is calling it:

select `favorites`.`product_id` from `favorites` -- app/models/favorite.rb:28:in `block in <class:favorite>' 

you will now be able to see the origin of your expensive queries, and be able to track down the caller of the query when using the pg_stat_statements and pg_stat_activity tables:

$ heroku pg:psql
select (total_time / sum(total_time) over()) * 100 as exec_time, calls, query
from pg_stat_statements
order by total_time desc
limit 10;

--------------------------------------------------------------------------------
exec_time | 12.2119460729825
calls     | 7257
query     | select `favorites`.`product_id` from `favorites` -- app/models/product.rb:28:in `block in <class:product>'

many orms provide this feature built-in or via extensions. make sure you use it, and your debugging and optimization will be easier.

learn more

there is much more you can learn about postgres, either via the excellent documentation of the project itself, or the heroku postgres dev center reference. share your own tips with the community on the #postgrestips hashtag.


Information

postgresql 9.6 is now generally available for heroku postgres. the main focus of this release is performance. postgresql 9.6 includes enhanced parallelism for key capabilities, setting the stage for significant performance improvements across a variety of analytic and transactional workloads.

with 9.6, certain actions, like individual queries, can be split up into multiple parts and performed in parallel. this means that everything from running queries to creating indexes and sorting has major improvements that should allow a number of different workloads to execute faster than they did in prior releases of postgresql. with 9.6, the postgresql community, along with heroku’s own open source contributions to this release (a special thanks to peter geoghegan), has laid the foundation to bring those enterprise-class features to the world’s most advanced open source relational database.

parallelism, performance, and scale

performance in the form of parallelism means that more work can be done at the same time. one of the areas where this makes a big difference is when postgres needs to scan an entire table to generate a resultset.

imagine for a moment that your postgresql installation has a table in it called emails that stores all of the emails being sent by customers within an application. let’s say that one of the features that’s provided to customers as part of the application is giving counts on the number of emails being sent to particular email addresses, filtered by the type of person that’s receiving the email. that query might look something like this:

select e.to, count(*) as total
from emails e
where e.person_type = 'executives'
group by e.to

in this scenario, if our customers have been sending a large number of emails to executives, an index on the person_type column would not help, because rows with 'executives' in that column represent too many of the rows in the table. in that case, postgresql will resort to scanning all of the rows in the table to find matches for executives.

for relatively small tables, say thousands of rows, postgresql might be able to perform this quickly. but, if the table has 100 million rows or more, that query will slow to a crawl because it needs to scan every single row. in the 9.6 release, postgresql will be able to break apart the above query and search portions of the table at the same time. this should greatly speed up queries that require full table scans, which happens more often than you think in analytics-based workloads.

the performance improvements in 9.6 weren’t limited to sequential scans on large tables. much of the work heroku contributed to this release was in the way of improved sorting. one of the areas where you’ll see considerable improvement is when you create indexes concurrently. under the hood, each row in a table has what’s called a tuple id (tid), not to be confused with an object id. a tid consists of two parts, a block and a row index. together, the tid identifies where the row can be found within the physical structure of the table. our patch to this code took the tuple ids and transformed them into a different format prior to sorting in the index which would allow postgresql to sort the tids even faster.

with our contributions to sorting, when you want to create an index concurrently by using the create index concurrently syntax, you can experience up to a 50% performance improvement on index creation in certain cases. this is an amazing patch because when create index concurrently is used, it won’t lock writes to the table in question like create index would. this allows your application to operate like it normally would without adverse effects.

notable improvements

beyond the work done on parallelism, postgresql 9.6 has a number of noteworthy improvements:

  • the postgresql foreign data wrapper now supports remote joins, updates, and batch updates, which means you can distribute workloads across many different postgresql instances.

  • full text search can now search for adjacent words.

  • improvements to administrative tasks like vacuum, which now avoids scanning pages unnecessarily. this is particularly useful for tables that are append-only, like events or logs.

getting started

when a new heroku postgres database is provisioned on any one of the heroku plan tiers, whether on the common runtime or in private spaces, 9.6 will be the default version. if you have an existing database on the platform, please check out our documentation for upgrading. this is an exciting update to postgresql that should have many benefits for the workloads that run on heroku. give postgresql 9.6 a spin and let us know how we can make postgresql even better. together, we can make postgresql one of the best relational databases in the business!


Information

heroku recently released a managed apache kafka offering. as a node.js developer, i wanted to demystify kafka by sharing a simple yet practical use case with the many node.js developers who are curious how this technology might be useful. at heroku we use kafka internally for a number of uses including data pipelines.  i thought that would be a good place to start.

when it comes to actual examples, java and scala get all the love in the kafka world.  of course, these are powerful languages, but i wanted to explore kafka from the perspective of node.js.  while there are no technical limitations to using node.js with kafka, i was unable to find many examples of their use together in tutorials, open source code on github, or blog posts.  libraries implementing kafka’s binary (and fairly simple) communication protocol exist in many languages, including node.js.  so why isn’t node.js found in more kafka projects?

i wanted to know if node.js could be used to harness the power of kafka, and i found the answer to be a resounding yes.

moreover, i found the pairing of kafka and node.js to be more powerful than expected.  functional reactive programming is a common paradigm used in javascript due to the language’s first-class functions, event loop model, and robust event handling libraries.  with frp being a great tool to manage event streams, the pairing of kafka with node.js gave me more granular and simple control over the event stream than i might have had with other languages.

continue reading to learn more about how i used kafka and functional reactive programming with node.js to create a fast, reliable, and scalable data processing pipeline over a stream of events.

the project

i wanted a data source that was simple to explain to others and from which i could get a high rate of message throughput, so i chose to use data from the twitter stream api, as keyword-defined tweet streams fit these needs perfectly.

fast, reliable, and scalable.  what do those mean in this context?

  • fast.  i want to be able to see data soon after it is received -- i.e. no batch processing.
  • reliable.  i do not want to lose any data.  the system needs to be designed for “at least once” message delivery, not “at most once”.
  • scalable.  i want the system to scale from ten messages per second to hundreds or maybe thousands and back down again without my intervention.

so i started thinking through the pieces and drew a rough diagram of how the data would be processed.

data pipeline

each of the nodes in the diagram represents a step the data goes through.  from a very high level, the steps go from message ingestion to sorting the messages by keyword to calculating aggregate metrics to being shown on a web dashboard.

i began implementation of these steps within one code base and quickly saw my code getting quite complex and difficult to reason about.  performing all of the processing steps in one unified transformation is challenging to debug and maintain.

take a step back

i knew there had to be a cleaner way to implement this.  as a math nerd, i envisioned a way to solve this by composing simpler functions -- maybe something similar to the posix-compliant pipe operator that allows processes to be chained together.

javascript allows for various programming styles, and i had approached this initial solution with an imperative coding style.  an imperative style is generally what programmers first learn, and probably how most software is written (for good or bad).  with this style, you tell the computer how you want something done.

contrast that with a declarative approach in which you instead tell the computer what you want to be done.  and more specifically, a functional style, in which you tell the computer what you want done through composition of side-effect-free functions.

here are simple examples of imperative and functional programming.  both examples result in the same value.  given a list of integers, remove the odd ones, multiply each integer by 10, and then sum all integers in the list.

imperative

const numlist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
let result = 0;

for (let i = 0; i < numlist.length; i++) {
  if (numlist[i] % 2 === 0) {
    result += (numlist[i] * 10)
  }
}

functional

const numlist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

const result = numlist
  .filter(n => n % 2 === 0)
  .map(n => n * 10)
  .reduce((a, b) => a + b, 0)

both complete execution with result equal to 300, but the functional approach is much easier to read and is more maintainable.

if that’s not readily apparent to you, here’s why: in the functional example, each function added to the “chain” performs a specific, self-contained operation.  in the imperative example, the operations are mashed together.  in the functional example, state is managed for me within each function, whereas i have to manage changing state (stored in the result variable) during execution in the imperative version.

these may seem like small inconveniences, but remember this is just a simple example.  in a larger and more complex codebase the minor inconveniences like these accumulate, increasing the cognitive burden on the developer.

the data processing pipeline steps were screaming to be implemented with this functional approach.

but what about the reactive part?

functional reactive programming, “is a programming paradigm for reactive programming (asynchronous dataflow programming) using the building blocks of functional programming (e.g. map, reduce, filter)" [frp].  in javascript, functional reactive programming is mostly used to react to gui events and manipulate gui data.  for example, the user clicks a button on a web page, the reaction to which is an xhr request which returns a chunk of json.  the reaction to the successfully returned chunk of json is a transformation of that data to an html list, which is then inserted into the webpage’s dom and shown to the user.  you can see patterns like this in the examples provided by a few of the popular javascript functional reactive programming libraries: bacon.js, rxjs, flyd.

interestingly, the functional reactive pattern also fits very well in the data processing pipeline use case.  for each step, not only do i want to define a data transformation by chaining functions together (the functional part), but i also want to react to data as it comes in from kafka (the reactive part).  i don’t have the luxury of a fixed length numlist.  the code is operating on an unbounded stream of values arriving at seemingly random times.  a value might arrive every ten seconds, every second, or every millisecond.  because of this i need to implement each data processing step without any assumptions about the rate at which messages will arrive or the number of messages that will arrive.

i decided to use the lodash utility library and bacon.js frp library to help with this.  bacon.js describes itself as, “a small functional reactive programming lib for javascript. turns your event spaghetti into clean and declarative feng shui bacon, by switching from imperative to functional...stop working on individual events and work with event-streams instead [emphasis added]” [bac].
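
here's a rough sketch of the shape this takes; the tweetSource emitter is an assumed stand-in for a kafka consumer callback, not part of bacon.js:

// rough sketch: turn a push source into a bacon.js event stream, then describe
// each processing step as a chained transform.
const Bacon = require('baconjs')
const EventEmitter = require('events')

// stand-in for a kafka consumer: anything that emits 'message' events will do
const tweetSource = new EventEmitter()

const tweetStream = Bacon.fromBinder(sink => {
  const push = msg => sink(msg)
  tweetSource.on('message', push)
  return () => tweetSource.removeListener('message', push)  // unsubscribe hook
})

tweetStream
  .map(msg => JSON.parse(msg.value))
  .filter(tweet => tweet.lang === 'en')
  .bufferWithTime(1000)   // gather one second's worth of tweets
  .onValue(batch => console.log(`tweets this second: ${batch.length}`))

// feeding the stream: tweetSource.emit('message', { value: '{"lang":"en","text":"..."}' })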

kafka as the stream transport

the use of event streams makes kafka an excellent fit here.  kafka’s append-only, immutable log store serves nicely as the unifying element that connects the data processing steps.  it not only supports modeling the data as event streams but also has some very useful properties for managing those event streams.

  • buffer: kafka acts as a buffer, allowing each data processing step to consume messages from a topic at its own pace, decoupled from the rate at which messages are produced into the topic.
  • message resilience: kafka provides tools to allow a crashed or restarted client to pick up where it left off.  moreover, kafka handles the failure of one of its servers in a cluster without losing messages.
  • message order: within a kafka partition, message order is guaranteed. so, for example, if a producer puts three different messages into a partition, a consumer later reading from that partition can assume that it will receive those three messages in the same order.
  • message immutability: kafka messages are immutable.  this encourages a cleaner architecture and makes reasoning about the overall system easier.  the developer doesn’t have to be concerned (or tempted!) with managing message state.
  • multiple node.js client libraries: i chose to use the no-kafka client library because, at the time, this was the only library i found that supported tls (authenticated and encrypted) kafka connections by specifying brokers rather than a zookeeper server.  however, keep an eye on all the other kafka client libraries out there: node-kafka, kafka-node, and the beautifully named kafkaesque.  with the increasing popularity of kafka, there is sure to be much progress in javascript kafka client libraries in the near future.

putting it all together: functional (and reactive) programming + node.js + kafka

this is the final architecture that i implemented (you can have a look at more details of the architecture and code here).

twitter data processing pipeline architecture

data flows from left to right.  the hexagons each represent a heroku app.  each app produces messages into kafka, consumes messages out of kafka, or both.  the white rectangles are kafka topics.

starting from the left, the first app ingests data as efficiently as possible.  i perform as few operations on the data as possible here so that getting data into kafka does not become a bottleneck in the overall pipeline.  the next app fans the tweets out to keyword- or term-specific topics.  in the example shown in the diagram above, there are three terms.  
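
a sketch of that fan-out step, using the no-kafka client mentioned earlier, might look like the following; the KAFKA_URL config var, topic names, and term list are assumptions for illustration:

// sketch of the fan-out step: read raw tweets and re-produce each one onto a
// per-term topic. topic names, terms, and KAFKA_URL are illustrative only.
const Kafka = require('no-kafka')

const consumer = new Kafka.SimpleConsumer({ connectionString: process.env.KAFKA_URL })
const producer = new Kafka.Producer({ connectionString: process.env.KAFKA_URL })
const terms = ['coffee', 'tea', 'cocoa']

Promise.all([consumer.init(), producer.init()]).then(() => {
  // a single partition is used here for brevity
  consumer.subscribe('tweets-raw', [0], (messageSet) => {
    messageSet.forEach(m => {
      const tweet = JSON.parse(m.message.value.toString('utf8'))
      terms
        .filter(term => tweet.text.toLowerCase().includes(term))
        .forEach(term => producer.send({
          topic: `tweets-${term}`,
          partition: 0,
          message: { value: JSON.stringify(tweet) },
        }))
    })
  })
})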

the next two apps perform aggregate metric calculation and related term counting.  here’s an example of the functional style code used to count the frequency of related terms.  this is called on every tweet.

function wordfreq(accumulator, string) {
  return _.replace(string, /[\.!\?"'#,\(\):;-]/g, '') // remove special characters
    .split(/\s/)
    .map(word => word.toLowerCase())
    .filter(word => ( !_.includes(stopwords, word) )) // dump words in stop list
    .filter(word => ( word.match(/.{2,}/) ))          // dump single char words
    .filter(word => ( !word.match(/\d+/) ))           // dump all numeric words
    .filter(word => ( !word.match(/http/) ))          // dump words containing http
    .filter(word => ( !word.match(/@/) ))             // dump words containing @
    .reduce((map, word) =>
      Object.assign(map, {
        [word]: (map[word]) ? map[word] + 1 : 1,
      }), accumulator
    )
}

a lot happens here, but it’s relatively easy to scan and understand.  implementing this in an imperative style would require many more lines of code and be much harder to understand (i.e. maintain).  in fact, this is the most complicated data processing step.  the functional implementations for each of the other data processing apps are even shorter.

finally, a web application serves the data to web browsers with some beautiful visualizations.

kafka twitter dashboard

summary

hopefully, this provided you with not only some tools but also the basic understanding of how to implement a data processing pipeline with node.js and kafka.  we at heroku are really excited about providing the tools to make evented architectures easier to build out, easier to manage, and more stable.

if you are interested in deploying production apache kafka services and apps at your company, check out our apache kafka on heroku dev center article to get started.


Information

today we are announcing a significant enhancement to heroku external objects: write support. salesforce users can now create, read, update, and delete records that physically reside in any heroku postgres database from within their salesforce deployment.

increasingly, developers need to build applications with the sophistication and user experience of the consumer internet, coupled with the seamless customer experience that comes from integration with salesforce. heroku external objects enables a compelling set of integration scenarios between heroku and salesforce deployments, allowing postgres to be updated based on business processes or customer records in salesforce.

with heroku external objects, data persisted in heroku postgres is presented as an external object in salesforce. external objects are similar to custom objects, except that they map to data located outside your salesforce org, and are made available by reference at run time.

integration with salesforce connect

heroku external objects is built to seamlessly integrate with salesforce connect using the odata 4.0 standard. salesforce connect enables access to data from a wide variety of external sources, in real-time, without the need to write and maintain integration code. this ‘integration by reference’ approach has a number of compelling benefits:

  • efficiency: fast time to value, absence of custom integration code, and reduced storage footprint.

  • low latency: accessing external objects results in data being fetched from the external system in real time, eliminating the risk of data becoming stale over time.

  • flexibility: external objects in salesforce share many of the same capabilities as custom objects such as the ability to define relationships, search, expose in lists and chatter feeds, and support for crud operations.

  • platform integration: external objects can be referenced in apex, lightning and visualforce, and accessed via the force.com apis.

common usage patterns

we have many heroku postgres customers with multi-terabyte databases, which are used in service to an incredibly diverse range of applications. when it comes to integrating this data with salesforce, we tend to see two non-exclusive integration patterns: salesforce as source of truth and postgres as source of truth.

salesforce as source of truth scenarios often entail updates originating from an external application to core salesforce objects such as orders, accounts, and contacts. because this data inherently belongs in salesforce, heroku connect synchronization is the preferred solution. with heroku connect, you can configure high-scale, low latency data synchronization between salesforce and postgres in a handful of mouse clicks.

postgres as source of truth scenarios typically require exposing discrete, contextually informed data points, such as an order detail, external status, or computed metric within salesforce. physically copying this type of data into salesforce would be inefficient and result in some degree of latency. heroku external objects allows data in postgres to be exposed as a salesforce external object, which is queried on access to facilitate real-time integration.

heroku external objects is the newest data integration service of heroku connect and is available today. for more information and documentation, visit the heroku connect page, the heroku dev center, or the documentation on force.com. for more information on salesforce connect, head on over to the trailhead.


Information

at rubykaigi i caught up with matz, koichi, and aaron patterson aka tenderlove to talk about ruby 3x3 and our path so far to reach that goal. we discussed koichi’s guild proposal, just-in-time compilation and the future of ruby performance.

jonan: welcome everyone. today we are doing an interview to talk about new features coming in ruby 3. i am here with my coworkers from heroku, sasada koichi and yukihiro matsumoto, along with aaron patterson from github.

jonan: so, last year at rubykaigi you announced an initiative to speed up ruby by three times by the release of version three. tell us more about ruby 3x3.

matz: in the design of the ruby language we have been primarily focused on productivity and the joy of programming. as a result, ruby was too slow, because we focused less on run-time efficiency, so we’ve tried to do many things to make ruby faster. for example the engine in ruby 1.8 was very slow, it was written by me. then koichi came in and we replaced the virtual machine. the new virtual machine runs many times faster. ruby and the ruby community have continued to grow, and some people still complain about the performance. so we are trying to do new things to boost the performance of the virtual machine. even though we are an open source project and not a business, i felt it was important for us to set some kind of goal, so i named it ruby 3x3. the goal is to make ruby 3 run three times faster as compared to ruby 2.0. other languages, for example java, use the jit technique, just in time compilation; we don't use that yet in ruby. so by using that kind of technology and with some other improvements, i think we can accomplish the three times boost.

aaron: so it’s called ruby 3x3, three times three is nine and jruby is at nine thousand. should we just use jruby?

jonan: maybe we should. so ruby 3x3 will be three times faster. how are you measuring your progress towards that goal? how do we know? how do you check that?

matz: yes, that's an important point. so in the ruby 3x3 project, we are comparing the speed of ruby 3.0 with the speed of ruby 2.0. we have completed many performance improvements in ruby 2.1 and 2.3, so we want to include that effort in ruby 3x3. the baseline is ruby 2.0. that's the clarification.

aaron: so your rails app will not likely be three times faster on ruby 3?

matz: yeah. our simple micro-benchmark may run three times faster but we are worried that a real-world application may be slower, it could happen. so we are going to set up some benchmarks to measure ruby 3x3. we will measure our progress towards this three times goal using those benchmark suites. we haven't set them all up yet but they likely include at least optcarrot (an nes emulator) and some small rails applications, because rails is the major application framework for the ruby language. we’ll include several other types of benchmarks as well. so we have to set that up, we are going to set up the benchmark suites.

jonan: so, koichi recently made some changes to gc in ruby. we now use a generational garbage collector. beyond the improvements that have been made already to gc, what possibility is there for more improvement that could get us closer to ruby 3x3? do you think the gc changes are going to be part of our progress there?

koichi: as matz says, ruby’s gc is an important component, and it has had a huge overhead. however, i don't think the recent generational garbage collector has nearly as much overhead. maybe only ten percent of ruby’s time is spent in gc, or something like that. if we can speed up garbage collection an additional ten times, it's still only ten percent of the overall time. so sure we should do more for garbage collection, but we have lots of other more impactful ideas. if we have time and specific requests for gc changes, we will certainly consider those.

aaron: … and resources...

koichi: yes.

aaron: the problem is, since, for us at github we do out-of-band garbage collections, garbage collection time makes no difference on the performance of the requests anyway. so even if garbage collection time is only ten percent of the program and we reduce that to zero, say garbage collection takes no time at all, that's not three times faster so we wouldn't make our goal anyway. so, maybe, gc isn't a good place to focus for the ruby 3x3 improvements.

matz: yeah we have already added the generational garbage collector and incremental garbage collection. so in some cases, some applications, large web applications for example, may no longer need to do that out-of-band garbage collection.

aaron: yeah, i think the only reason we are doing it is because we are running ruby 2.1 in production but we're actually on the path to upgrading. we did a lot of work to get us to a point where we could update to ruby 2.3, it may be in production already. my team and i did the work, somebody else is doing the deployment of it, so i am not sure if it is in production yet but we may soon be able to get rid of out-of-band collection anyway.

matz: yes in my friend's site, out-of-band collection wasn’t necessary after the deployment of ruby 2.3.

jonan: so the gc situation right now is that gc is only maybe about ten percent of the time it takes to run any ruby program anyway. so, even if we cut that time by half, we're not going to win that much progress.

matz: it's no longer a bottleneck so the priority is lower now.

jonan: at railsconf, aaron talked about memory and memory fragmentation in ruby. if i remember correctly it looked to me like we were defragging memory, which is addressed, so in my understanding that means that we just point to it by the address; we don't need to put those pieces of memory close together. i'm sure there's a reason we might want to do that; maybe you can explain it aaron.

aaron: sure. so, one of the issues that we had at, well, we have this issue at github too, is that our heap gets fragmented. we use forking processes, our web server forks, and eventually it means that all of the memory pages get copied out at some point. this is due to fragmentation. when you have a fragmented heap, when we allocate objects, we are allocating into those free slots and so since we're doing writes into those slots, it will copy those pages to child processes. so, what would be nice, is if we could eliminate that fragmentation or reduce the fragmentation and maybe we wouldn't copy the child pages so much. doing that, reducing the fragmentation like that, can improve locality but not necessarily. if it does, if you are able to improve the locality by storing those objects close to each other in memory, they will be able to hit caches more easily. if they hit those caches, you get faster access, but you can't predict that. that may or may not be a thing, and it definitely won't get us to ruby 3x3.

jonan: alright.

matz: do you have any proof on this? or a plan?

aaron: any plan? well yes, i prepared a patch that...

matz: making it easier to separate the heap.

aaron: yes, two separate heaps. for example with classes or whatever types with classes, we’ll allocate them into a separate heap, because we know that classes are probably not going to get garbage collected so we can put those into a specific location.

koichi: do you have plans to use threads at github?

aaron: do i have plans to use threads at github? honestly, i don't know. i doubt it. probably not. we'll probably continue to use unicorn in production. well i mean we could but i don't see why. i mean we're working pretty well and we're pretty happy using unicorn in production so i don't think we would switch. honestly, i like the presentation that you gave about guilds, if we could use a web server based on guilds, that would be, in my opinion, the best way.

matz: yes, i think it's promising.

jonan: so these guilds you mentioned (koichi spoke about guilds at rubykaigi), maybe now is a good time to discuss that. do you want to tell us about guilds? what they are and how they affect plans for ruby 3x3?

matz: we have three major goals in ruby 3. one of them is performance, which is that our program is really running three times faster. the second goal is the concurrency model, which is implemented by something like ruby guilds.

koichi: so concurrency and parallelism utilize some cpu cores.

matz: yeah, i say concurrency just because the guild is the concurrency model from the programmer's view. implementation-wise it should be parallelism.

koichi: i'm asking about the motivation of the concurrency.

matz: motivation of the concurrency?

koichi: not only the performance but also the model.

matz: well we already have threads. threads are mostly ok but they don't run in parallel, due to the existing gil (global interpreter lock). so guilds are a performance optimization. concurrency by guilds may make the threading program or the concurrency runtime program faster, but the main topic is the data abstraction for concurrent projects.

jonan: ok. so while we are on the topic of threads i am curious. i've heard people talk about how it might be valuable to have a higher level of abstraction on top of threads because threads are quite difficult to use safely. have you all thought about adding something in addition to threads that maybe protects us from ourselves a little bit around some of those problems? is that what guilds are?

aaron: yes, that's essentially what the guild is, it's a higher level abstraction so you can do parallel programming safely versus threads where it's not safe at all. it's just...

koichi: yes, so it's a problem with concurrency in ruby now; sharing mutable objects between threads. the idea of guilds, the abstraction more than guilds specifically, is to prohibit sharing of mutable objects.

jonan: so when i create an object how would i get it into a guild? if i understand correctly, you have two guilds - a and b - and they both contain mutable objects. with the objects in a, you could run a thread that used only those objects, and run a different thread that used only objects in b, and then you would eliminate this problem and that's why guilds will exist. but how do i put my objects into guilds or move them between guilds? have you thought about it that far yet?

matz: yeah, a guild is like some kind of bin, a container with objects. with it, you cannot access the objects inside the guild from outside, because the objects are members of the guild. however, you can transfer the objects from one guild to another. so, by transferring, the new objects can be accessed in the destination guild.

jonan: i see, ok. so the objects that are in a guild can't be accessed from outside that guild; other guilds can't get access to them. then immutable objects are not members of guilds. they are outside.

koichi: so immutable objects are something like freelance objects. freelance objects are immutable, so any guild can access them because there are no read-write conflicts.

jonan: so you would just use pointers to point to those immutable objects?

koichi: yes. also, i want to note that immutable doesn't mean frozen object. frozen objects can contain mutable objects. so i mean those immutable objects which only contain children that point to immutable objects.

jonan: so if we had a nested hash, some large data structure, we would need to freeze every object in that in order to reference it this way. is there a facility in ruby right now to do that? i think i would have to iterate over that structure freezing objects manually today.

matz: not yet.

jonan: so there might be?

matz: we need to provide something to freeze these objects.

aaron: a deep freeze.

matz: yes, deep freeze.

jonan: deep freeze is the name of this feature maybe? i think that would be an excellent name for it.

aaron: i like deep freeze. (koichi would like to note that the name for this feature has not yet been determined)

jonan: i think you mentioned it earlier but maybe you could tell us a little more about just in time compilation, the jit, and how we might approach that in ruby 3.

matz: the jit is a pretty attractive technology for gaining performance. you know, as part of the ruby 3x3 effort we are probably going to introduce some kind of jit. many other virtual machines have introduced the llvm jit. however, personally, i don't want to use the llvm jit for ruby 3, just because the llvm itself is a huge project, and it's much younger than ruby. ruby is more than twenty years old. it's possibly going to live for twenty more years, or even longer, so relying on other huge projects is kind of dangerous.

aaron: what do you think of shyouhei’s stuff?

matz: the optimizer?

aaron: yeah.

matz: yeah, it's quite interesting, but its application is kind of limited. we have to measure it.

koichi: i think shyouhei’s project is a good first step, but we need more time to consider it.

jonan: can you explain what it is?

aaron: yeah, so shouhei, what he did was he...

matz: de-optimization.

aaron: yeah he introduced a de-optimization framework that essentially lets us copy old instructions, or de-optimized instructions, into the existing instruction sequences. so he can optimize instructions and if anything happens that would… well, i guess i should step back a little bit. so if you write, in ruby, 2 + 4, typically the plus operator is not overwritten. so if you can make that assumption then maybe we can collapse that down and replace it with just six. right?

jonan: i see.

aaron: but if somebody were to override the plus method, we would have to not do that, because we wouldn't know what the plus does. and in order to do that, we have to de-optimize and go back to the original instructions that we had before. so, what shouhei did was he introduced this de-optimization framework. it would allow us to take those old instructions and copy them back in, in case someone were to do something like what i described, overriding plus.

matz: jruby people implement very nice de-optimization technologies. they made just such a de-optimization framework on the java virtual machine, so on this topic at least they are a bit ahead of us.

aaron: well the one thing, the other thing that i don't know; if you watch the jruby or jruby plus truffle stuff, if you read any of the papers about it, there are tradeoffs, the jit isn't free. i mean we have to take into consideration how much memory that will require. people hearing this shouldn't think "oh well let's just add a jit, that's all we have to do and then it will be done". it’s much harder; there are more tradeoffs than simply adding a jit.

jonan: yes. so there was an older implementation, rujit, the ruby jit, but rujit had some memory issues didn't it?

koichi: yes, quite severe. it needed a lot of memory. such memory consumption is controllable, however, so we can configure how much memory they can use.

jonan: ok, so you just set a limit for how much the jit uses and then it would do the best it could with what you had given it, basically?

koichi: yeah.

jonan: ok.

koichi: rujit can improve the performance of micro-benchmarks but i’m not sure about the performance in larger applications.

jonan: so, for rails applications maybe we should call it "ruby 1.2x3" or something.

aaron: i think that's an interesting question to bring up because if a rails application is part of the base benchmarks, are we really going to make a rails application three times faster?

matz: we need to make our performance number calculations pretty soon. this is a big problem i think. so maybe some kind of operation such as concatenating...

aaron: concatenation, yeah.

matz: … or temporary variable creation or something like that, we can improve the performance.

aaron: so, i think it's interesting if we come up with a benchmark that's using string concatenation. i mean we could use an implementation for that. for example, what if we used ropes instead. if we did that, maybe string concatenation would become very fast, but we didn't really improve the virtual machine at all, right? so, how do we balance, does that make sense? how do we balance those things?

matz: so unlike the typical application, the language can be applied anywhere, so it can be used to write rails applications, or science applications, or games, so i don't think improving that one thing will necessarily change that situation. so we have to do everything, maybe introducing ropes, introducing a jit in some form, introducing some other mechanisms as well to see that improvement. we have to do it.

aaron: so maybe the key is in the benchmarks that we have. we have something doing a lot of string concatenations, something doing a lot of math, maybe something doing, i don't know, i/o. something like that?

matz: yeah. we have to. we cannot be measured by one single application, we need several.

aaron: right.

matz: and then in the rails application we have to avoid the database access. just because, you know, access to the database is slow, can be very slow. that we cannot improve.

jonan: so, along with the jit, you've also talked about some type changes coming to ruby 3 and the optional static types. can you tell us about that?

matz: yeah, the third major goal of ruby 3 is adding some kind of static typing while keeping the duck typing, so some kind of structure for soft-typing or something like that. the main goal of the type system is to detect errors early. so adding this kind of static type check or type interfaces does not affect runtime.

matz: it’s just a compile time check. maybe you can use that kind of information in ides so that the editors can use that data for their code completion or something like that, but not for performance improvement.

aaron: you missed out on a really good opportunity for a pun.

jonan: did i? what was the pun?

aaron: you should have said, "what type of changes will those be?"

jonan: what type of type changes would those be? yes. i've been one-upped once again, by pun-master aaron here.

aaron: i was holding it in, i really wanted to say something.

jonan: you looked up there suddenly and i thought, did i move on too early from the jit discussion? no, it was a pun. that was the pun alert face that happened there, good. i'm sorry that we missed the pun. so, to summarize then, the static type system is not something that will necessarily improve performance...

koichi: yes.

jonan: ...but it would be an optional static type system, and it would allow you to check some things before you're running your program and actually running into errors.

matz: yeah, and if you catch those errors early you can improve your productivity.

jonan: yes, developer productivity.

matz: yeah.

jonan: which is, of course, the primary goal of ruby, or developer happiness rather, not necessarily productivity. so, the jit, this just in time compiler, right now ruby has ahead of time compilation (aot) optionally? there's some kind of aot stuff that you can do in ruby?

matz: i don't code with it.

aaron: “some, kind of”.

jonan: ok.

aaron: it has a framework built in to allow you to build your own aot compiler. it has the tools in there to let you build an aot compiler, and i think you wrote a gem, the...

koichi: yeah, yomikomu.

aaron: yeah.

jonan: ok. yomikomu is an aot compiler for ruby. can you describe just a little bit what that means? what ahead of time compilation would mean in this case? what does it do?

koichi: ruby compiles at runtime, so we could store the compiled binary to the file system or something, some database or somewhere. the yomikomu gem uses this feature, writing out instruction sequences to the file system at runtime, so we can skip the compiler tool in the future. it’s only a small improvement, i think, maybe 30%.

aaron: 30%?

matz: 30% is huge.

aaron: yeah!

jonan: that seems like a pretty good improvement to me.

koichi: i do think so.

aaron: we just need a few more 30% improvements then ruby 3x3 is done.

matz: yeah, that means 30% of the time is spent in the compiler.

koichi: yeah, in 2.3.

matz: that’s huge!

aaron: that's what i said!

jonan: so, rather than jit, have you thought about maybe like a little too late compiler? we could just compile after the program runs and we don't need to compile it all then. maybe it wouldn't be as popular as a just in time compiler.

aaron: one thing i think would be interesting, one thing that i'd like to try someday, is to take the bytecode that's been written out and analyze it. so we could know for example that we can use this trick that shyouhei’s doing with constant folding. since we have all of the bytecode written out, you should be able to tell by analyzing the bytecode whether or not... actually maybe you couldn't tell that. i was going to say we could analyze the bytecode and optimize it with code, rewriting an optimized version to disk. but since you can do so much stuff at runtime, i don't know if it would work in all cases.

koichi: this is exactly what the jit or some kind of comparable approach aims to do.

aaron: yeah.

jonan: so, cases like you were talking about earlier where this plus can be overridden in ruby, so what you would do is assume the plus is not overridden and you would just put six, you would actually write that into the bytecode, just the result of this value. then this framework would allow you to later, if someone did overwrite the plus method dynamically while the program was running, to swap it out again for the old implementation.

aaron: yes.

jonan: ok.

aaron: so basically the public service announcement is: "don't do that."

jonan: don't do that. don't override plus.

aaron: just stop it.

jonan: just stop it. you're going to make the ruby team's life harder.

koichi: yes, lots harder.

jonan: ok. is there anything else you would like to add about ruby 3? anything we didn't touch on today that might be coming?

matz: you know, we’ve been working on ruby 3 for maybe two years right now, but we are not expecting to release in a year or even two. maybe by 2020?

aaron: does that mean that we have to wait, are we really going to wait for ruby 3 to introduce guilds? or are we going to introduce that before ruby 3?

matz: before ruby 3 i guess.

aaron: ok.

matz: yeah, we still have a lot of things to do to implement guilds.

aaron: of course.

matz: for example, the garbage collection is pretty difficult. the isolated threads can't access the same objects in that space, so it will be very difficult to implement garbage collection. i think we’ve had a lot of issues with that in the past, so that could take years. but if we’re done, we are happy to introduce guilds into maybe ruby 2... 6?

aaron: 2.6, yeah.

matz: so this is because we don't want to break compatibility. so if a program isn’t using guilds it should run the same way.

jonan: so this is how we are able to use immutable objects in ruby, but they’re frozen objects. they can’t be unfrozen.

matz: no.

jonan: ok.

koichi: freezing is a one-way operation.

aaron: yes.

jonan: ok. so then, a friend asked me when i described guilds, he writes a lot of haskell, he asked me when we are going to have "real immutable objects", and i don't quite know what he means. is there some distinction between an immutable object in ruby and an immutable object in a different language that’s important?

matz: for example in haskell, everything is immutable, it’s that kind of language, everything is immutable from day one.

jonan: yes.

matz: but in ruby we have mutable objects, so under that kind of situation we need a whole new construct.

aaron: frozen objects should really be immutable. it's really immutable.

jonan: ok.

aaron: i don't...

jonan: you don't know what this person who air-quoted me "real immutable" was saying?

aaron: yeah i don't know why they would say "real immutable".

jonan: should i unfriend him on facebook? i think i'm going to after this.

matz: at least tell him if you want "real immutable" go ahead and use haskell.

jonan: i think that's an excellent option, yeah.

aaron: you just need to say to them quit "haskelling" me.

jonan: i should, i’ll just tell them to quit "haskelling" me about immutable objects. well, it has been a pleasure. thank you very much for taking the time. we've run a little bit longer than promised but i think it was very informative, so hopefully people get a lot out of it. thank you so much for being here.


Information

kyle seaman is director of farm technology for freight farms, producer of pre-assembled, iot-enabled, hydroponic farms inside repurposed freight containers. read the freight farms customer story to learn more about how heroku has helped the company scale their business.

what is freight farms?

our flagship product, the leafy green machine (lgm), is a complete, commercial-ready, hydroponic growing system assembled inside a repurposed shipping container. each of our 100+ farms is connected to an iot network built on heroku.

tell us about your stack.

we’re running the open source version of the parse server on heroku. our stack is mostly javascript: mongodb along with a node.js api. we also use heroku postgres.

xively is a core component of our stack. we use the xively add-on to sync our heroku apps with their iot cloud platform enabling us to connect the physical sensors and gateway devices located in our individual farm units to our software system running on heroku.

our favorite heroku add-on is papertrail, which has been great with helping us monitor our apps. heroku connect has also made it really easy to sync data with our salesforce crm instance.

our customer-facing app, farmhand, is a native ios and react web app. farmhand's connect features enable farmers to track, monitor, and control the climate and growing conditions of their farm remotely. farmhand’s shop is an e-commerce app where farmers can buy supplies.

how does data flow through your hardware and software systems?

every farm is provisioned with a gateway device that’s running a xively mqtt client locally. it’s connected to the other hardware devices, such as ip security cameras, wireless sensors, and the farm’s automation controller. xively serves as our mqtt communication gateway and broker.

freight farm architectural diagram

every minute, the gateway is collecting 100+ data points that give us information about the farm environment. this ranges from simple data, such as whether the lights are on or off, to more complex readings of nutrient levels, co2, and temperature.

the local mqtt client running in each shipping container is a lua application. it connects each farm to xively’s cloud mqtt broker. this means that any other client can subscribe to listen in. as data is being passed up to the mqtt cloud broker, our server on heroku is listening so it can process that data in real-time.

the iot communication gateway and time series tracking is provided by xively, and the logic is happening on heroku. as a farm sends out data, xively can respond to time series queries and heroku filters the data to check for issues.

do you communicate the other way, from your server to the farm?

yes, we do. our heroku app has a range of workflows and monitoring triggers that, when triggered, send a command via mqtt down to the farm. the local client is also listening for messages from the heroku side.

there are also endpoints that allow our farmers to control the farm remotely. when farmers sign in to the farmhand app and want to do something like turn the lights on, a heroku app handles that request. it sends a message to the farm over mqtt, which then routes it to the farm’s automation controller to initiate the action.

what is the advantage to using mqtt?

in general, this mqtt architecture has worked really well for us. we were originally using https and a raspberry pi for sending data from the farm, but we’ve upgraded to a proper gateway, which gives us better security and enables sending data to the farm. because mqtt is such a strong iot standard now, it was very straightforward to convert our one-way https infrastructure to two-way tls mqtt.

with mqtt, you have one broker and any number of clients. each new message only needs to be sent over the wire from the farm once, but will end up on all connected clients and our server instantly. this both saves on bandwidth and scales well as we bring more farms and client apps online.

mqtt is built around the concept of channels (ex. /sensors), and each client can subscribe to and extend channels (ex. /sensors/{farmid}). xively has extended mqtt in an interesting way with the concept of templates. they don’t just have a /sensor channel that every device or client can subscribe to. instead, every farm has its own set of channels built from the template, e.g. /{farmid}/sensors. as new channels are created, they are propagated to all devices. this adds a nice layer of security and separation, ensuring that messages only get to the intended device. from a development standpoint, it makes it that much easier to work with the protocol as each device shares the same properties.

xively also adds the concept of organizations, allowing only clients and devices in the same organization to actually send messages to each other. this added logic on the broker means there is nothing extra for us to do to get full acl out of the box.

our message volume is very high. each new request, such as “calibrate sensors,” initiates a specific session that subscribes to a farm channel. so if i create a new channel for a calibration, it will propagate to every device on every farm, making it a lot easier for us.
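
to make the channel template idea concrete, here is a rough sketch of a generic mqtt client (using the mqtt npm package) subscribing to a per-farm channel and publishing a command back down; the broker url, channel names, and farm id are illustrative assumptions rather than xively's actual api:

// rough sketch: subscribe to one farm's sensor channel and send a command down.
// MQTT_BROKER_URL, the channel names, and farmId are assumptions for illustration.
const mqtt = require('mqtt')

const farmId = 'farm-042'
const client = mqtt.connect(process.env.MQTT_BROKER_URL)  // e.g. mqtts://broker.example.com:8883

client.on('connect', () => {
  client.subscribe(`${farmId}/sensors`)  // per-farm channel built from the template
  client.publish(`${farmId}/commands`, JSON.stringify({ action: 'lights_on' }))
})

client.on('message', (topic, payload) => {
  console.log(`${topic}: ${payload.toString()}`)
})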

mqtt can be challenging in a multi-dyno setup, how are you using it?

generically subscribing to a topic on heroku is challenging, as each dyno will receive the message at the same time. however, sending messages from a running process over mqtt using websockets works really well, and that’s how we are using it.

we have a number of workflows and monitoring triggers that start up a process on heroku; this could be a farmer hitting “start calibration” on their mobile app, or the server automatically stopping a pump once the tank is full. for user workflows, when one is started, the farmhand app is listening to our server on heroku, and the server uses mqtt to start a session subscribing to that farm’s specific channel for the workflow. this actually allows us to do less filtering on all inbound messages because the server process and app subscribe to the actual topic url associated with that farm.

interestingly, we’ve found it faster to do tasks such as calibration through our cloud-based system rather than locally using a plc (programmable logic controller) for the farm.

can you tell us more about the hardware and sensors?

each farm has 40 pieces of environmental equipment that are controlled by an automation controller that uses the modbus protocol. these pieces of equipment are for anything climate-related or mechanical, such as lights, pumps, and fans. every time a piece of equipment changes state (e.g. “the pump turned off”), we send this info up through the gateway to our server. at any point, heroku or a client app can send the appropriate command down to the hardware (e.g. “lights turn off”).

each farm has 10 sensors. some track climate-related conditions, such as humidity, co2, and temperature. others track conditions in our growing tanks, such as nutrients or ph levels in the water.

do you push firmware updates?

our very first farm that started 4 years ago is still running the same automation controller, which didn’t ship with ota firmware updates. so a big consideration as we deployed our heroku / xively infrastructure was backwards compatibility. our gateway has solved this issue: it augments the existing controller and is able to update itself with all our newer features through server-side logic. that’s a big evolution for our company—extending functionality to push firmware updates remotely, without requiring the user to do them manually onsite.

so, instead of having to go onsite to a farm to update firmware or fine-tune hardware, we can optimize the farm from anywhere through our app. we can also learn from optimizing a single farm and apply it to other farms and crops.

how are you applying what you’ve learned?

because we can access any or all of our farms on demand, we’re building out new workflows that help us optimize farm operations based on real usage data.

we’ve written monitoring logic on heroku to handle scenario and location-specific conditions. for example, if a farm is located in a cool climate region, we control the farm’s internal temperature by cooling it via the intake fans rather than ac in order to save energy.

our next area of opportunity is crop optimization. for example, if you’re growing basil, what are the right conditions that will produce the best crop in your area? we’re starting to analyze yield data, but it’s still a very manual process. the new track feature in farmhand just launched a couple of weeks ago, which will help aggregate and analyze data from all our farms. once we have historical data and algorithms in place, we’ll be able to strengthen our growing parameters for specific crops, as well as trigger alerts to farmers if conditions are suboptimal.

we’re finally hitting scale with 100 farms deployed. many of these farms are just coming online, so we’re just getting the data now to validate some of these ideal “growing recipes.”

what’s next for freight farms?

we’re in a “consume and learn” phase. the next big project will be exposing this data through farmhand analysis tools to help our farmers continue to grow the best product possible. we’ve also just launched our newest product, the lgc, which is our most connected farm yet, and ¼ the size of the lgm. as we continue down the connected path, we’ll continue to discover new and exciting use cases for our farms that weren’t possible before.


Information

many of the compelling and engaging application experiences we enjoy every day are powered by event-based systems; requesting a ride and watching its progress, communicating with a friend or large group in real time, or connecting our increasingly intelligent devices to our phones and each other. behind the scenes, similar architectures let developers connect separate services into single systems, or process huge data streams to generate real-time insights. together, these event-driven architectures and systems are quickly becoming a powerful complement to the relational database and app server models that have been at the core of internet applications for over twenty years.

at heroku, we want to make the power of this increasingly important model available to a broader range of developers, allowing them to build evented applications without the cost and complexity of managing infrastructure services. today we are making a big step towards that goal with the general availability of apache kafka on heroku. kafka is the industry-leading open source solution for delivering data-intensive applications and managing event streams, and is backed by an active open source community and ecosystem. by combining the innovation of the kafka community with heroku’s developer experience and operational expertise, we hope to make this compelling technology even more vibrant and widely adopted, and to enable entirely new use cases and capabilities.

introducing apache kafka on heroku

heroku’s fully managed offering provides all the power, resilience, and scalability of kafka without the complexity and challenges of operating your own clusters. here at heroku, it has been a critical part of our success in delivering fast data processing pipelines for our platform metrics, in creating a robust event stream for our api, and in unifying many of our systems and sources of data. in the ecosystem at large, kafka has been instrumental to the successful scaling of platforms like linkedin, uber, pinterest, netflix, and square.

creating your kafka clusters is as simple as adding the service to a heroku app:

heroku addons:create heroku-kafka -a sushi-app 


features of kafka on heroku include:

  • automated operations: apache kafka is a complex, multi-node distributed system, providing power but requiring significant operational care and feeding. with kafka on heroku, all of those operational burdens disappear - like other heroku data services such as postgres, everything from provisioning to management and availability is handled automatically. adding kafka to your application is now as simple as a single command.

  • simplified configuration: optimizing kafka configuration and resourcing for different applications and use cases is often an art unto itself. apache kafka on heroku offers a set of preconfigured plan options tested and optimized for the most common use cases. available in both the common runtime and within heroku private spaces, kafka on heroku can be provisioned with the security and level of network isolation that meets your application needs.

  • service resiliency, upgrades, and regions: one of the more powerful capabilities of apache kafka on heroku is the self-healing and automated recovery the service provides; if a broker becomes unavailable, the service will automatically replace and re-establish failed elements of the cluster. in addition, kafka on heroku clusters can be upgraded in place with no downtime all the way up to multi-terabyte scale. like other heroku data services offered in private spaces, kafka on heroku is available in different regions and geographies, including the east and west coasts of the united states, dublin, frankfurt, tokyo, and sydney.

  • dashboard and developer experience: at the core of apache kafka on heroku is a simple provisioning and management experience that fits seamlessly into the rest of the heroku platform. in addition to a robust set of cli commands and options, a new dashboard provides both visibility into the utilization and behavior of a given kafka cluster, as well as a simple configuration and management ui.

the plans now available are dedicated clusters, optimized for high throughput and high volume. we will continue to extend this range of plans to cover a broader set of needs, and to make evented architectures available to developers of all stripes.

kafka and event architectures

events are everywhere in modern application development, and they are becoming increasingly important. they’re the lifeblood of everything from activity streams and iot devices to mobile apps and change data capture. leveraging these event streams is more important than ever, but doing so requires employing a new set of concepts.

traditional data architectures focused on transactions and remote procedure calls. these approaches are relatively easy to reason about at certain levels of scale. as systems scale, deal with intense data throughput, and integrate diverse services, developers find that there are very real limits to the transactional and service to service models.

kafka handles event streams with ease, and provides a solid foundation for designing the logic of your data flows, unifying various data systems, and enabling real-time processing. kafka also delivers a small but elegant set of abstractions to help developers understand critical facets of event-driven systems like event delivery, ordering, and throughput.
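
as a concrete illustration, here is a minimal sketch of producing and consuming events with the community ruby-kafka gem against a local broker. the broker address and topic name are placeholders, and connecting to kafka on heroku additionally requires the ssl configuration described in its dev center documentation:

require "json"
require "kafka"

# connect to a local broker; on heroku the broker urls and ssl credentials
# come from the config vars the add-on sets for you
kafka = Kafka.new(["localhost:9092"], client_id: "example-app")

# publish an event to a topic
kafka.deliver_message({ order_id: 42, status: "shipped" }.to_json, topic: "orders")

# consume events from the topic, in order, one at a time
kafka.each_message(topic: "orders") do |message|
  puts "offset=#{message.offset} value=#{message.value}"
end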

kafka enables you to easily design and implement architectures for many important use cases, such as:

  • elastic queuing

    kafka makes it easy for systems to accept large volumes of inbound events without putting volatile scaling demands on downstream services. these downstream services can pull from event streams in kafka when they have capacity, instead of being reactive to the “push” of events. this improves scaling, the handling of fluctuations in load, and general stability.

  • data pipelines & analytics

    applications at scale often need analytics or etl pipelines to get the most value from their data. kafka’s immutable event streams enable developers to build highly parallel data pipelines for transforming and aggregating data. this means that developers can achieve much faster and more stable data pipelines than would have been possible with batch systems or mutable data.

  • microservice coordination

    many applications move to microservice style architectures as they scale, and run up against the challenges that microservices entail: service discovery, dependencies and ordering of service availability, and service interaction. applications that use kafka for communication between services can simplify these design concerns dramatically. kafka makes it easy for a service to bootstrap into the microservice network, and discover which kafka topics to send and receive messages on. ordering and dependency challenges are reduced, and topic based coordination lowers the overhead of service discovery when messages between services are durably managed in kafka.

get started with kafka today

apache kafka on heroku is available in all of heroku’s runtimes, including private spaces. again, to get started, provision kafka from heroku elements or via the command line. we are excited to see what you build with kafka! if you have any questions or feedback, please let us know at [email protected] or attend our technical session on nov 3rd to see a demo and get your questions answered!


Information

encrypted communication is now the norm for applications on the internet. at heroku, part of our mission is to spread encryption by making it easy for developers to set up and use ssl on every application. today we take a big step forward in that mission by making heroku ssl generally available, allowing you to easily add ssl encryption to your applications with nothing more than a valid ssl certificate and custom domain.

heroku ssl is free for custom domains on hobby dynos and above and relies on the sni (“server name indication”) extension which is now supported by the vast majority of browsers and client libraries. the current ssl endpoint will remain available for the increasingly rare instances where your applications need to support legacy clients and browsers that do not support sni.

we had an overwhelmingly positive response to our beta launch, and are looking forward to having more and more users and teams use the new ssl service.

encryption as the default, made easy

the first step for using heroku ssl is getting an ssl certificate. upload the certificate either via the dashboard on your application’s settings page or from the cli. with this release we’ve made it even easier to complete the ssl certificate setup process in dashboard:

ssl certificate interface

to upload via the cli, use the heroku certs command as follows:

$ heroku certs:add example.pem example.key 

when done with the certificate, the last step before starting to use heroku ssl is updating your dns records to point at the new dns target. have a look at the documentation in dev center for full details.

note that our previous add-on, ssl endpoint, will continue to exist. however, we highly recommend that you switch to heroku ssl as we will be rolling out exciting new features to it over the coming months. in case you currently have an ssl endpoint and would like to switch, we have some guidelines here on how to migrate from ssl endpoint to heroku ssl.

feedback

we hope these changes make security on heroku more solid and easier to access and set up for all users. your feedback is welcome and highly appreciated. please write to us by selecting product feedback here: product feedback.


Information

austen ito is a software engineer at leading online fashion brand bonobos, based in new york. read our bonobos customer story for more information about how heroku has helped their business.

what do you have running on heroku?

we’re running just about everything on heroku, including our bonobos.com website, cross-app messaging services, an api for our erp, as well as some internal tools. the only pieces that are not on heroku are the data science and erp components. we’re also using desk.com for customer service queuing.

walk us through your stack

we use a mix of backbone and react in terms of javascript frameworks on the front end. some of our legacy work is in backbone and our newer work is in react, so we’re slowly moving the old work to react. we’re using a flux-like framework called redux.

on the backend, we use rails and solidus, a fork of spree, the open source e-commerce engine built on top of rails. the solidus project was a great bootstrap for our api, admin, and data model, as well as our store customization.

are you taking a microservices approach?

yes, but not quite. we have services that are somewhat micro, but not everything is a microservice. i’ve seen that done to the extreme, and that approach is not necessarily one that we’d want to take here with the size of our team.

what database are you using?

we’re using heroku postgres premium. we find the automatic promotion of the follower is great for us. we use read-only replicas for reporting and store them in our redshift data warehouse. there are also internal users who need read-only access, such as our analytics and data science teams.

what heroku add-ons are you using?

we use a few, including heroku postgres, redis cloud, librato, logentries, heroku scheduler, ssl, and rollbar.

rollbar is a really nice product for error handling, both front- and back-end. it groups errors really well together so you can do analysis and chase errors throughout the system. it has nice alerting rules around reactivation errors or error count notification.

we also use skylight, which is a rails instrumentation engine for app performance monitoring. their agent is lightweight and you get a good sense of the long tail of your performance, rather than just the median or worst case.

what technology do you use for your guideshops (brick-and-mortar versions of bonobos)?

our guideshop model is different than the traditional retail store in that it focuses on providing customer service and removing the nuances associated with it. that in-store experience is enhanced with technology we’re using here at bonobos.com. we work with a company called tulip. they built the ios tablet application for our sales associates and it draws everything through our api. employees are able to put in orders, do returns, as well as help people with their wish list and size preferences. all products, orders, everything is still administrable, still lives in our admin backend, but we don’t have to maintain that interface.

thank you for your time!

thank you!  


Information

at heroku, we're always working towards increased operational stability with the services we offer. as we recently launched the beta of apache kafka on heroku, we've been running a number of clusters on behalf of our beta customers.

over the course of the beta, we have thoroughly exercised kafka through a wide range of cases, which is an important part of bringing a fast-moving open-source project to market as a managed service. this breadth of exposure led us to the discovery of a memory leak in kafka, having a bit of an adventure debugging it, and then contributing a patch to the apache kafka community to fix it.

issue discovery

for the most part, we’ve seen very few issues running kafka in production. the system itself is very stable and performs very well even under massive amounts of load. however, we identified a scenario where processes in a long-running cluster would no longer fully recover from a restart. the process would seemingly restart successfully but always crash several seconds later. this is troubling to us; a significant part of safely managing and operating a distributed system like kafka is the ability to restart individual members of a cluster. we started looking into this as a high priority problem.

fortunately, an internal team had a staging cluster where we could reproduce the issue to investigate further without impacting a production app. we quickly figured out a remediation that we could apply in case we hit this issue on a production cluster before we addressed the bug, and then got to debugging the issue at hand.

a tale of failure to debug

i'm going to discuss the process of debugging this failure, including all the dead ends we encountered. many debugging stories jump straight to "we found this really difficult bug", but leave out all the gory details and dead ends chased. this is a false picture of fixing gnarly bugs -- in reality nearly all of them involve days of going down dead ends and misunderstanding.

given that we were able to reproduce the issue consistently, we began our investigation by simply starting the process and looking at the resources on the server using top and similar diagnosis tools. it was immediately clear this was a memory issue. the process would boot, allocate memory at a rate of around 400mb/s without freeing any, and then crash after exhausting all the memory on the server.

we confirmed this by looking at stdout logs of the process. they contained this wonderful snippet:

# there is insufficient memory for the java runtime environment to continue.
# native memory allocation (mmap) failed to map 131072 bytes for committing reserved memory.
# possible reasons:
#   the system is out of physical ram or swap space
#   in 32 bit mode, the process size limit was hit
...

we now had two indications that there was a memory issue of some kind! this is progress - we can now start looking at memory allocation, and mostly ignore all other issues. typically, jvm memory is allocated on the heap, and java has a bunch of tooling to understand what's on the heap. we broke open the logs emitted by the jvm's garbage collector and found a mystery inside. the gc logs were almost empty, indicating the program wasn't under heap pressure at all. furthermore, what little information we did have indicated that this broker only ever had 100 megabytes of on-heap memory used. this didn't make sense given our previous evidence from top and stdout.

jvm memory: a recap

the jvm mostly allocates memory in two ways:

  • on-heap
  • off-heap

on-heap memory represents the majority of memory allocation in most jvm programs, and the garbage collector manages allocation and deallocation of on-heap memory. some programs do use notable amounts of "off-heap" or native memory, whereby the application controls memory allocation and deallocation directly. kafka shouldn't typically be using a lot of off-heap memory, but our next theory was that it must be doing exactly that. clearly, this was the only alternative, right? it can't be on-heap memory, or we'd see more information in the gc logs.

to test this theory, we wrote a small script to query the jmx metric java.lang:type=memory, which tells you about how much on- and off-heap memory the jvm thinks it is using. we ran this script in a tight loop while starting the broker, and saw, to our frustration, nothing useful:

...
heap_memory=100 mb offheap_memory=63.2 mb
heap_memory=101 mb offheap_memory=63.2 mb
heap_memory=103 mb offheap_memory=63.2 mb
...
crash

neither on-heap nor off-heap memory was being used! but what else can even allocate memory? at this point, we reached further into our debugging toolkit.

tracing: a magical debugging tool

tracing is a very effective debugging tool, often employed in this kind of situation. are the logs not telling you what you need to know? time to dump out relevant events from a tracing tool and start looking through them for ideas.

in this particular case, we used sysdig, an especially powerful tool for debugging issues on a single server. sysdig allows you to capture system calls, much like the more venerable strace. a syscall is the mechanism by which a userland process communicates with the kernel. seeing as most kinds of resource usage involve talking to the kernel, looking at syscalls is a very effective way to diagnose this kind of issue.

sysdig is best used in a "capture, then analyze" mode, much like tcpdump. this lets you write all the syscalls emitted by a process to a file, and then take your time analyzing them.

the command:

sudo sysdig 'proc.name=java' -w ~/sysdig.scap 

will capture to a file all syscalls emitted by the process named java. we ran this command, then booted our broker and watched it crash.

now that we have a file full of syscalls, what do we look at? the capture file, in this case, was 434mb, which you can't "just read through". sysdig gives you a suite of analysis tools for looking at the events emitted by the process. in this case, we're interested in memory allocation, so we're mostly interested in the mmap and munmap syscalls.

the issue we're debugging is that somewhere, kafka is allocating memory and never freeing it. remember, this isn't on-heap memory or off-heap memory, so something is doing native memory allocation.

firing up sysdig, we see that this program does indeed allocate a lot of memory using mmap syscalls. analysis using bash scripting reveals that 9gb of memory is being allocated using mmap during this run. this is more memory than was available on the server, which seems to point in the right direction. when memory is allocated by mmap, the caller has to call 'munmap' eventually to release it back to the operating system. not releasing memory back to the operating system is the definition of a memory leak and will cause the process to crash after the leak has used all available memory.

a quick but complex sysdig query reveals this to be the case:

$ sudo sysdig -r ~/sysdig.scap 'evt.type=mmap' -p '%proc.vmsize %evt.dir %evt.type %evt.info' | grep 'length=' | wc -l
2551
$ sudo sysdig -r ~/sysdig.scap 'evt.type=munmap' -p '%proc.vmsize %evt.dir %evt.type %evt.info' | grep 'length=' | wc -l
518

a lot is going on in this query -- sysdig is an incredibly powerful tool. this specific query allows us to see the memory usage of the process at the moment in time when the syscall was traced (that's what %proc.vmsize does in the format argument). while we are just counting events here, we also examined them for patterns, and this memory usage output was invaluable there.

at this point, we were stumped. we are allocating a lot of memory, but not freeing it. the mmap calls didn't have any particular pattern that we could determine. at this stage in debugging, it's often time to take a break, and let your unconscious mind think up some possibilities. recognize that you are human: getting rest helps with debugging just as much as staring at a screen.

why is all the ram gone?

a short while later, we had a hunch to follow: it's something to do with the cluster being configured to use gzip to compress data.

to provide context, kafka can use compression to dramatically reduce data volumes transferred, both over the network and to disk. this particular cluster had gzip compression enabled. java exposes gzip compression in the jvm classes gzipinputstream and gzipoutputstream. those classes are backed entirely by a c library, zlib, which does its own memory allocation.

a quick google search for "jvm gzip memory leak" led us to this article, which describes almost exactly what we were seeing: a usage pattern with gzip that causes the jvm to run out of memory very easily. you open a gzipinputstream or gzipoutputstream and never close it when you're finished with it. this explains why the jvm didn't show this memory as either on-heap or off-heap -- it can't know what memory this c library uses.

we broke open the kafka source code, and found a point where it was opening a gzipinputstream to decompress some data, but never closing it. upon a restart, kafka has to recover recently written messages, which involves decompressing them and checking them for validity. most code in kafka that does decompression reads the whole stream of compressed messages then closes the decompressor. this validity check works differently. it opens the compressed messages, reads the first message and checks its offset. it then short-circuits, failing to ever close the native buffers! this leads to our memory leak upon restart. as it so happens, this code path is never hit during normal operation. it only happens on restarts, which fits our evidence as well.

we confirmed this memory issue by reproducing the bug in a development version of kafka and then failing to reproduce after applying a patch that closes the gzipinputstream when short-circuiting.
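
the same anti-pattern is easy to reproduce in any language that wraps zlib. as a rough analogy only (the actual kafka patch lives in the jvm codebase, not in ruby), here is what the "read a little, then always close" pattern looks like in ruby:

require "zlib"
require "stringio"

# read only the first line of a gzip-compressed payload, but always
# close the reader so the underlying zlib buffers are released
# immediately instead of lingering until the object is garbage collected
def first_line(compressed_bytes)
  gz = Zlib::GzipReader.new(StringIO.new(compressed_bytes))
  begin
    gz.readline
  ensure
    gz.close
  end
end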

this is often how debugging difficult bugs goes: hours and days of no progress staring at code and application logs and trace logs, and then some hunch points you in the right direction and is trivial to confirm. trying different tools and different ways of looking at the problem help you get to a hunch.

giving back

having found and diagnosed the issue, we sent in an upstream ticket to the kafka project and started working on a patch. after some back and forth review from the kafka committers, we had a patch in trunk, which is included in the brand new release of kafka 0.10.0.1. it's interesting to note how small the resulting patch was - this was a major bug that meant kafka couldn't boot in certain situations, but the bug and resulting bug fix were both very simple. often a tiny change in code is responsible for a huge change in behavior.

kafka 0.10.0.1 is now the default version on apache kafka on heroku. for those of you in the apache kafka on heroku beta, you can provision a new cluster that doesn't have this issue. likewise, we have tooling in place to upgrade existing versions.

for our beta customers, the command

heroku kafka:upgrade heroku-kafka --version 0.10 

where heroku-kafka is the name of your heroku kafka cluster, will perform an in-place upgrade to the latest version of 0.10 (which is 0.10.0.1 at the time of writing).

if you aren’t in the beta, you can request access here: https://www.heroku.com/kafka.


Information

so you want to build an app with react? "getting started" is easy… and then what?

react is a library for building user interfaces, which comprise only one part of an app. deciding on all the other parts — styles, routers, npm modules, es6 code, bundling and more — and then figuring out how to use them is a drain on developers. this has become known as javascript fatigue. despite this complexity, usage of react continues to grow.

the community answers this challenge by sharing boilerplates. these boilerplates reveal the profusion of architectural choices developers must make. that official "getting started" seems so far away from the reality of an operational app.

new, zero-configuration experience

inspired by the cohesive developer experience provided by ember.js and elm, the folks at facebook wanted to provide an easy, opinionated way forward. they created a new way to develop react apps, create-react-app. in the three weeks since initial public release, it has received tremendous community awareness (over 8,000 github stargazers) and support (dozens of pull requests).

create-react-app is different than many past attempts with boilerplates and starter kits. it targets zero configuration [convention-over-configuration], focusing the developer on what is interesting and different about their application.

a powerful side-effect of zero configuration is that the tools can now evolve in the background. zero configuration lays the foundation for the tools ecosystem to create automation and delight developers far beyond react itself.

zero-configuration deploy to heroku

thanks to the zero-config foundation of create-react-app, the idea of zero-config deployment seemed within reach. since these new apps all share a common, implicit architecture, the build process can be automated and then served with intelligent defaults. so, we created this community buildpack to experiment with no-configuration deployment to heroku.

create and deploy a react app in two minutes

you can get started building react apps for free on heroku.

npm install -g create-react-app
create-react-app my-app
cd my-app
git init
heroku create -b https://github.com/mars/create-react-app-buildpack.git
git add .
git commit -m "react-create-app on heroku"
git push heroku master
heroku open

try it yourself using the buildpack docs.

growing up from zero config

create-react-app is very new (currently version 0.2) and since its target is a crystal-clear developer experience, more advanced use cases are not supported (or may never be supported). for example, it does not provide server-side rendering or customized bundles.

to support greater control, create-react-app includes the command npm run eject. eject unpacks all the tooling (config files and package.json dependencies) into the app's directory, so you can customize to your heart's content. once ejected, changes you make may necessitate switching to a custom deployment with node.js and/or static buildpacks. always perform such project changes through a branch / pull request, so they can be easily undone. heroku's review apps are perfect for testing changes to the deployment.

we'll be tracking progress on create-react-app and adapting the buildpack to support more advanced use cases as they become available. happy deploying!


Information

today we're announcing two new features that will help you better manage and run apps on heroku: threshold alerting and hobby dyno metrics. threshold alerting provides the ability to set notification thresholds for key performance and health indicators of your app. we’ve also extended basic application metrics to hobby dynos to provide basic health monitoring and application guidance. together these features allow you to stay focused on building functionality by letting the platform handle your app monitoring.

threshold alerting

there are many ways to measure the health of an application. the new alerting feature focuses on what is most important to the end users of your app: responsiveness and request failures.

responsiveness is measured by tracking the maximum response time for 95% of the requests, known as the 95th percentile (or p95) response time. when measuring response time to gauge how users are experiencing your app, it is important to focus on the overall distribution of responses and not just the average response time of requests. the 95th percentile measure has become a best practice in the industry and heroku has always tracked this measure for your apps and displayed it on the metrics view. now you can define alerts that are triggered if the p95 response time exceeds the threshold you specified.
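
as a rough illustration of the metric itself (not of how heroku computes it internally), a nearest-rank p95 over a set of request durations can be calculated like this:

# response times for recent requests, in milliseconds (made-up sample data)
response_times_ms = [42, 48, 49, 51, 55, 58, 61, 63, 70, 2300]

def percentile(values, pct)
  sorted = values.sort
  rank = ((pct / 100.0) * sorted.length).ceil - 1
  sorted[rank]
end

p95 = percentile(response_times_ms, 95)
puts "p95 response time: #{p95}ms"  # only 5% of requests were slower than this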

in addition to response time, the request error rate is an important indicator of application health. if the percentage of failed requests suddenly spikes, you can now be alerted that something might be wrong with your app.

alerting is available on apps running on professional dynos (standard-1x and up).

alert configuration

first, select a metric on which to alert such as response time. then, enter the alerting threshold (minimum response time is 50ms). select the sensitivity setting, i.e. the minimum time the threshold must be breached for an alert to be triggered. options are high (1 minute), medium (5 minutes), and low (10 minutes). the alert simulation displays the alerts as they would appear based on the past 24 hours of your app’s data using the current threshold settings so that you can find a settings sweet-spot that won’t overwhelm your inbox.

notifications setup

the default distribution for email notifications is all app owners and collaborators for non-org accounts, and admins for accounts in a heroku enterprise org. you can also add an additional email address, including an email-based pagerduty integration.

after configuring your alerts, you can specify the type(s) of email alert you wish to receive and the notification frequency (every 5 minutes, 1 hour, or 1 day), if applicable. leaving both options unchecked results in dashboard alerts only. lastly, activate your alert.

dashboard notifications

on the application metrics tab, alerts appear in both the events table and on the corresponding plot.

hobby dyno metrics

traditionally, hobby dynos had no access to the application metrics ui. now basic metrics are available, providing data for the past 24 hours at 10-minute resolution. this includes errors, events, application guidance, response time, throughput, memory, and dyno load. standard 1x dynos and above provide metrics data with up to 1-minute resolution for a two hour time period. additionally, hobby users will now be able to try out labs features that previously would have only been available to higher level dyno types. for free dyno users considering an upgrade, hobby dyno metrics and heroku teams are two great reasons.

find out more

for more details on both features refer to the application metrics dev center article.

if there’s an additional alerting metric or feature you would like to see, drop us a line at [email protected]. we also send out a user survey 1-2 times a year that helps drive our operational experience product roadmap. if you would like to be notified, send us your email, or watch the metrics tab for a survey link.


Information

redis might sound like it’s just a key/value store, but its versatility makes it a valuable swiss army knife for your application. caching, queueing, geolocation, and more: redis does it all. we’ve built (and helped our customers build) a lot of apps around redis over the years, so we wanted to share a few tips that will ensure you get the most out of redis, whether you’re running it on your own box or using the heroku redis add-on.

use a connection pooler

by using a connection pooler, you'll reduce the connection overhead and therefore speed up operations while reducing the number of connections you use.

most redis libraries will provide you with a specific connection pooler implementation; you just have to make sure you use them. measure, compare, and adapt the size of your redis connection pool so that you get the best performance out of it.

you'll probably maintain a connection pool per dyno, so make sure that the total number of connections across all of your pools doesn’t exceed your add-on plan’s maximum connections.
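
for example, in ruby you might combine the redis gem with the connection_pool gem. this is just a sketch assuming those two libraries; other languages have their own equivalents:

require "connection_pool"
require "redis"

# a small pool of redis connections shared across threads in one dyno
REDIS_POOL = ConnectionPool.new(size: 5, timeout: 5) do
  Redis.new(url: ENV["REDIS_URL"])
end

# check a connection out of the pool only for as long as you need it
REDIS_POOL.with do |redis|
  redis.incr("pageviews")
end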

give a name to your client

redis allows you to list connected clients using client list. this command will give you lots of useful information about them too:

client list
id=2 addr=127.0.0.1:49743 fd=5 name=web.1 age=11 idle=0 flags=n db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=client

one very simple way to make that view even more useful is to set a name to all of your connections. in order to do that you can use client setname. i would recommend setting them to your dyno name, using the dyno environment variable, so that redis receives a command like this:

client setname web.1 

now, you will be able to track and scan your connections at a quick glance.
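
for example, with the ruby redis gem you could send the raw command at boot. this is a sketch; the exact way to issue arbitrary commands varies between client libraries and versions:

require "redis"

redis = Redis.new(url: ENV["REDIS_URL"])

# name this connection after the dyno it belongs to, e.g. "web.1"
redis.call("client", "setname", ENV.fetch("DYNO", "local"))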

set an appropriate key eviction policy

redis by default will never evict keys, which means that once your redis memory limit is reached, it will start returning errors if you try to create or update keys.

to save yourself from those errors, you should make sure you have an appropriate key eviction policy. here’s a quick rundown of where you might use each one:

  • caching-only use cases: allkeys-lru will remove the least recently used keys first, whether or not they are set to expire.
  • mixed usage: volatile-lru will remove the least recently used keys first, but only among keys that have an expiry set.
  • non-caching usage: the default noeviction policy will never silently drop keys; once the memory limit is reached, any write that needs more memory will return an error instead.

at the end of the day, these are just guidelines. as always, review your application usage and review all the policies available to you before making a decision.

avoid using keys

keys is a useful command during development or debugging, but it can seriously degrade performance if used in production. keys is an o(n) operation, where n is the total number of keys in the database, not just the keys that match your pattern. if you absolutely need to iterate over keys, consider using scan instead, which walks the keyspace incrementally.
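
with the ruby redis gem, for instance, scan_each wraps scan in an iterator. the key pattern below is just an example:

require "redis"

redis = Redis.new(url: ENV["REDIS_URL"])

# iterate the keyspace incrementally instead of blocking the server
# with a single KEYS call
redis.scan_each(match: "session:*") do |key|
  puts key
end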

set an appropriate connection timeout

by default, redis will never close idle connections, which means that if you don't close your redis connections explicitly, you can eventually exhaust your connection limit and lock yourself out of your instance.

to ensure this doesn't happen, heroku redis sets a default connection timeout of 300 seconds. this timeout doesn’t apply to publish/subscribe clients or to clients waiting on blocking operations.

ensuring that your clients close connections properly, and that your timeout value is appropriate for your application, will mean you never run out of connections.

use multi

multi is a very useful command. it allows a set of operations to be executed with some atomic guarantees. you can think of it as a basic transaction-like semantic. here is an example of how to use it:

multi
hmset atomic name "project manhattan" location "los alamos" created 1942
zadd history 1942 "atomic"
exec

this provides us with some atomic guarantees, but not full acid compliance. if your application requires stronger guarantees, you should consider using postgres, which provides a wide range of isolation levels, plus the ability to roll back.
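
the same transaction from the ruby redis gem might look like this (a sketch; recent versions of the gem yield a transaction object to the multi block):

require "redis"

redis = Redis.new(url: ENV["REDIS_URL"])

# both commands are queued and then executed atomically on exec
redis.multi do |tx|
  tx.hmset("atomic", "name", "project manhattan", "location", "los alamos", "created", 1942)
  tx.zadd("history", 1942, "atomic")
end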

what are your tips?

redis is powerful and versatile, and though we see a lot of useful patterns from our customers and in our own apps, we can’t possibly see them all. share what you’ve learned on the #redistips hashtag, and spread the good word.


Information

scott raio is co-founder and cto of combatant gentlemen, a design-to-delivery menswear e-commerce brand. read our combatant gentlemen customer story to learn more about how heroku helped them build a successful online business.

what microservices are you running in heroku private spaces?

we’ve written an individual service for every business use case. for example, we have services for order processing, product catalog, account management, authentication, swatch display, pos, logistics, payments, etc.

with all these different services, we chose heroku private spaces as a way to make service discovery easier. we’re currently running about 25 services, which is a relatively small number compared to netflix or twitter (who employ hundreds of services). but we’re growing, and we’re always evaluating our services to determine which ones are too large and need to be broken out.

most of our services work autonomously and share nothing between them. when the services are isolated and containerized, then changes are much simpler. it’s a very clean approach.

what languages did you use to write your microservices?

most of our services are written in ruby, due to the language’s development speed and flexibility. we leverage grape, a ruby framework for creating rest-like apis.

we use go for some services due to its raw speed. for example, our api gateway is written in go. it consumes all the host and service information, and does various things such as circuit breaking, fail over, and error management. we’re really excited about that and just deployed it in heroku private spaces.

the client-facing service that backs our uis is written in node.js and powers our backbone and marionette front-end apps. we use mongodb for all our database needs.

how do you handle service discovery?

we needed a microservices “chassis” – something that could create a platform for us to rapidly deploy services without having to worry too much about the inner workings of those services. we found many piecemeal technologies that could address different parts of the solution we needed, but nothing that could glue them all together.  so we built our own framework that we call “vine” (relating to “grape”) that connects all the services together and helps us maintain shared code between each service. the shared code aspect is more about the developer experience than service functionality. the framework’s plug-in architecture lets us write new core functionality that is modular.  anytime we need to add functionality to a group of services, we can write code once and share it everywhere. for example, this allows us to manage service configuration of our fastly cdn really easily.

how did you decide to create your own framework to manage your microservices?

there was nothing in the ruby world that worked for us. we’re not dealing with large amounts of data, but we’re dealing with a very broad and rich dataset with many different constituents touching that data. also, our industry is greatly influenced by changing conditions in local markets and with global suppliers. we needed the ability to respond quickly with new features or updates.

in a sense, we modeled our framework after our experience with heroku. we didn’t want our engineers worrying about inter-service discovery and service communication. instead, we wanted them to focus on building features and launching product. this was why we switched to heroku a year and a half ago, and it’s the same reason we built our vine framework.

we’re really excited about heroku’s upcoming dns service discovery feature that’s coming out soon. it will help us stabilize a lot of aspects of our system and we plan to integrate it into our existing architecture.

how do you handle inter-service communication?

we use rabbitmq for asynchronous message passing and http with messagepack or zeromq for our synchronous communication.

we’ve explored ideas around using kafka, but we’ve invested a lot of time in rabbitmq so we’ll continue using that for the time being. we’ve been able to deploy really fast serialization between services, which is something we could only do in heroku private spaces.  because it’s an isolated environment, we can talk to services directly rather than having to go to the internet.

what has been your experience working with heroku private spaces?

we’ve worked really closely with heroku support and the private spaces team on getting our service discovery working smoothly. heroku private spaces has given us all the benefits of the heroku platform combined with the architectural flexibility we needed to simplify private dyno communication and service discovery. because we don’t have to go over the internet, things are inherently secure and can be optimized without authentication later.

our next step is to get down and dirty with our data layer, connecting our private space to an amazon vpn. this will allow us to connect our database to the same private network as our private space.

any advice to others looking to take a microservices approach?

to me, the world of microservices is like the wild west right now. there are so many opinions, directions, and projects out there, creating a lot of noise in this area. to figure out the right microservice approach for your project, you just have to dive in and go through it yourself. you have to experience the bad with the good in order to come up with a strong opinion about which way to go for your project or company.


Information

we recently launched apache kafka on heroku into beta. just like we do with heroku postgres, our internal engineering teams have been using our kafka service to power a number of our internal systems.

the big idea

the heroku platform comprises a large number of independent services. traditionally we’ve used http calls to communicate between these services. while this approach is simple to implement and easy to reason about, it has a number of drawbacks. synchronous calls mean that the top-level request time will be gated by the slowest backend component. also, internal api calls create tight point-to-point couplings between services that can become very brittle over time.

asynchronous messaging has been around a long time as an alternative architecture for communicating between services. instead of using rpc-style calls between systems, we introduce a message bus into our system. now to communicate between system a and b, we can simply have system a publish messages to the message bus, and system b can consume those messages whenever it wants. the message bus stores messages for a certain period of time, so communication can occur between system a and system b even if both aren’t online at the same time.

increasingly we are moving from our synchronous integration pattern to the asynchronous messaging integration pattern, powered by kafka. this creates much looser coupling between our services. this allows our services and (importantly!) our development teams to operate and to iterate more independently. the message stream produced by system a creates an abstract contract - as long as system a continues to publish a compatible stream of messages, then both systems a and b can be modified without regard to the other. even better, the producing system doesn’t need to know anything about the consuming system(s). we can add or remove consumers at any time.

compared to traditional message brokers, kafka offers a number of benefits. it offers blazing performance and scalability, with the ability to handle hundreds of thousands of messages per second. its architecture supports relatively long-term message storage, enabling consumers to read back in time many hours. and its simple log-oriented design provides good delivery guarantees without requiring any complex ack/nack protocol. finally, its multi-node architecture offers zero downtime (brokers within a cluster can be upgraded independently) and simple horizontal scalability. this makes kafka suitable for a large range of integration and stream processing use cases, all running against the same kafka cluster.

the platform event stream

our core internal service generates an abstract event stream representing all resource changes on the platform - we call this the platform event stream. we’ve built a number of variations of this stream, once on postgres, once with aws kinesis, and our latest version uses the kafka service.

as a globally shared service, kinesis throttles read bandwidth from any single stream. this sounds reasonable, but in practice means that adding additional consumers to a stream slows down all consumers of that stream. this resulted in a situation where we were reluctant to add additional consumers to the platform event stream. this encouraged us to re-implement the stream on kafka. we have been very happy with the minimal resources required to serve additional consumers - a single kafka cluster can easily serve hundreds of clients.

when we launched the kafka version of the platform event stream, we wanted to ease the transition for the clients of our existing kinesis stream. these clients expected an http-based interface and a managed authentication system. we had expansive plans to allow lots of different clients, and we wanted both to simplify the process of creating new clients as well as be able to control stream access at a fine-grained level.

so we decided to implement a simple proxy to front our kafka cluster. the proxy uses http post for publishing and a websocket channel for consuming. it also implements a custom client authentication scheme. the proxy offers a layer of manageability on top of our kafka cluster which allows us to support a large set of client use cases. it also allows us to protect our kafka cluster inside a secure heroku private space while still allowing controlled access from outside the space.
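
to give a feel for the shape of that interface, here is a hypothetical publisher in ruby. the proxy url, topic name, config var, and auth header are all invented for illustration, since the proxy's actual api is internal to heroku:

require "net/http"
require "uri"
require "json"

# post a single event to a hypothetical http front-end for kafka
uri = URI("https://events-proxy.example.com/streams/platform-events")

request = Net::HTTP::Post.new(uri, "content-type" => "application/json")
request["authorization"] = "Bearer #{ENV["EVENT_PROXY_TOKEN"]}"
request.body = JSON.generate({ resource: "app", action: "update", id: "sushi-app" })

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  http.request(request)
end

puts response.code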

the proxy exacts a performance penalty relative to the native kafka protocol, but we’ve found that kafka is so fast that this penalty is acceptable for our requirements. some other teams at heroku with “bare metal” performance needs are using their own kafka clusters with native clients.

despite the trade-offs, we have been very happy with the results. we have more than ten different consumers of the platform event stream, and the minimal onramp costs for connecting to the proxy (requiring nothing more than a websocket client) are enabling that number to grow steadily. kafka’s robust scalability means that adding another consumer of the event stream costs almost nothing in additional resources, and this has led us to create new consumers for lots of purposes that we never originally envisioned.

generalizing the architecture

the success of kafka plus the websocket proxy has encouraged us to generalize the proxy to support additional event streams beyond our original one. we have now opened up the proxy so that other teams can register new kafka topics hosted within our cluster. this gives them the kind of zero-administration service they expect with low cost and high scalability.

some features that we would like to support in the future include:

  • a schema registry to hold definitions for all events, both for discoverability and potentially message validation
  • message filtering
  • public consumers. eventually we hope to expose the event stream as a primitive to all clients of the heroku platform api.

confluent has some interesting open source offerings in these areas, including their own rest proxy for kafka and their schema registry for kafka.

a path to a new architecture

this asynchronous integration pattern aligns well with the broader architectural shift away from batch processing with relational databases towards real-time stream processing. rethinking your services as event-driven stream processors offers a path towards a much more agile, scalable, and real-time system but requires thinking very differently about your systems and the tools you are using. kafka can play a key role in enabling this new style of real-time architecture, and techniques like using an http proxy are effective tools for easing experimentation and adoption.

moving to a real-time, asynchronous architecture does require significant new ways of thinking. push channels must be created to notify users of system state asynchronously, rather than simply relying on the http request/response cycle. rethinking the notion of “persistent state” as a “point in time snapshot” rather than “canonical source of truth” implies a very different application architecture than the ones to which most engineers are accustomed. architecting for eventual consistency and compensating transactions requires developing new techniques, libraries, and tools.


Information

heroku bumped its bundler version to 1.13.7 almost a month ago, and since then we've had a large number of support tickets opened, many a variant of the following:

your ruby version is <x>, but your gemfile specified <y> 

i wanted to talk about why you might get this error while deploying to heroku, and what you can do about it, along with some bonus features provided by the new bundler version.

why?

first off, why are you getting this error? on heroku in our ruby version docs, we mention that you can use a ruby directive in your gemfile to specify a version of ruby. for example if you wanted 2.3.3 then you would need this:

# gemfile
ruby "2.3.3"

this is still the right way to specify a version, however recent versions of bundler introduced a cool new feature. to understand why this bug happens you need to understand how the feature works.

ruby version specifiers

if you have people on your team who want to use a more recent version of ruby locally, say ruby 2.4.0, but you don't want to force everyone to use that version, you can use a ruby version operator.

# gemfile
ruby "~> 2.3"

i don't recommend you do this, since "2.3" isn't technically a valid version of ruby. i recommend using full ruby versions in the version specifier, so that if you don't have a ruby version in your gemfile.lock, bundle platform --ruby will still return a valid ruby version.

you can use multiple version declarations just like in a gem for example: ruby '>= 2.3.3', '< 2.5'.

this says that any version of ruby from 2.3.3 up to (but not including) 2.5 is valid. this feature came in bundler 1.12 but wasn't made available on heroku until bundler 1.13.7.

in addition to the ability to specify a ruby version specifier, bundler also introduced locking the actual ruby version in the gemfile.lock:

# gemfile.lock
ruby version
   ruby 2.3.3p222

when you run the command:

$ bundle platform --ruby
ruby 2.3.3p222

you'll get the value from your gemfile.lock rather than the version specifier from your gemfile. this is to provide you with development/production parity. to get that ruby version in your gemfile.lock you have to run bundle install with the same version of ruby locally, which means when you deploy you'll be using a version of ruby you use locally.

did you know this is actually how heroku gets your ruby version? we run the bundle platform --ruby command against your app.

so while the version specifier tells bundler what version ranges are "valid" the version in the gemfile.lock is considered to be canonical.

an error by any other name

so if your app uses the specifier ruby "~> 2.3" and you try to run it with ruby 1.9.3, you'll get an error:

your ruby version is 1.9.3, but your gemfile specified ~> 2.3 

this is the primary intent of the bundler feature: to prevent you from accidentally using a version of ruby that isn't valid for the app. however, if heroku gets the ruby version from bundle platform --ruby, and that comes from the gemfile and gemfile.lock, how could you ever be running a version of ruby on heroku different from the version specified in your gemfile?

one of the reasons we didn't support bundler 1.12 was due to a bug that allowed incompatible gemfile and gemfile.lock ruby versions. i reported the issue, and the bundler team did an amazing job patching it and releasing the fix in 1.13.

what i didn't consider after is that people might still be using older bundler versions locally.

so what is happening is that people update the ruby version specified in their gemfile without running bundle install, so their gemfile.lock does not get updated. then they push to heroku and it breaks. or they're using an older version of bundler locally, so their gemfile.lock contains a ruby version that's incompatible with their gemfile but no error is raised. then they push to heroku and it breaks.

so if you're getting this error on heroku, run this command locally to make sure your bundler is up to date:

$ gem install bundler
successfully installed bundler-1.13.7
1 gem installed
installing ri documentation for bundler-1.13.7...
installing rdoc documentation for bundler-1.13.7...

even if you haven't hit this bug yet, go ahead and make sure you're on a recent version of bundler right now. once you've done that run:

$ bundle install 

if you've already got a ruby version in your gemfile.lock you'll need to run

$ bundle update --ruby 

this will insert the same version of ruby you are using locally into your gemfile.lock.

if you get the exception "your ruby version is <x>, but your gemfile specified <y>" locally, it means you either need to update your gemfile to point at your version of ruby, or update your locally installed version of ruby to match your gemfile.

once you've got everything working, make sure you commit it to git:

$ git add gemfile.lock
$ git commit -m "fix ruby version"

now you're ready to git push heroku master and things should work.

when things go wrong

when these types of unexpected problems creep up on customers, we try to do as much as we can to make it easy for customers to understand both the problem and the fix. after seeing a few tickets come in, we shared the information internally with our support department (they're great, by the way). recently i added documentation to dev center covering this specific problem. i've also added some checks in the buildpack to give users a warning that points them to the docs. this is the best-case scenario: not only can we document the problem and the fix, we can also add guidance directly to the buildpack so you get it when you need it.

i also wanted to blog about it to help people wrap their minds around the fact that the gemfile is no longer the canonical source of the exact ruby version, but instead the gemfile.lock is. while the gemfile holds the ruby version specifier that declares a range of ruby versions that are valid with your app, the gemfile.lock holds the canonical ruby version of your app.

as ruby developers we have one of the best (if not the best) dependency managers in bundler. i'm excited for more people to start using version specifiers if the need arises for their app and i'm excited to support this feature on heroku.


Information

choices are an important part of a healthy open source software community. that’s why we’re excited about yarn, a new package manager that addresses many of the problems with node’s default package manager, npm. while npm has done a fantastic job creating a large and vibrant javascript ecosystem, i want to share why yarn is an important addition to the node.js ecosystem, how it will improve your node.js development experience, and how heroku has incorporated it into the build process for your heroku apps.

we began testing yarn almost immediately after it was released, and began fully supporting it on december 16.

about yarn

yarn was released in october 2016 and made a big splash immediately. and while it came out of facebook, yarn is a true open source project: it has a bsd license, clear contribution guidelines and code of conduct. big changes to yarn are managed through a public rfc process.

a few popular projects have switched to yarn. ruby on rails in version 5.1 switched to using yarn by default. jhipster, a java web app generator that incorporates spring boot, angular, and yeoman, has switched to yarn too.

why all the excitement?

yarn improves several aspects of dependency management for node developers. it pulls packages from the npm registry (via a proxy), which gives you access to the same huge selection of packages available via npm. but yarn gives you three big benefits:

  • predictable installs
  • security as a core value
  • performance

let’s look at each in more detail.

predictable installs

adding a new dependency to a node project can sometimes be a torturous task. it has become far too common for node developers to resort to rm -rf node_modules && npm install after encountering errors adding a new dependency to a project. you can kiss those days goodbye with yarn. an explicit design goal of yarn is deterministic dependency resolution.

what is deterministic dependency resolution? think of a pure function. if the same inputs go into a pure function, the same outputs always come out. dependency resolution should work the same way. if the same dependency list goes in, the same node_modules folder should come out.

why is this important?

we want to make every deploy on heroku a low risk event. it should be a fast and common part of your everyday development workflow. we also want to encourage dev / prod parity—the code that runs in your development environment should be exactly what runs in production.

running npm install on two different machines can result in different dependencies being installed on each machine. these two machines could be two developer machines. or, much more problematic, they could be a development machine and a production machine. the crux of this is that the dependency tree npm generates is determined by the order in which packages are installed. for more detail on this, i suggest you read this page from npm’s documentation.

how does yarn address this?

yarn’s lockfile is the key component making its dependency resolution deterministic. using the lockfile, yarn will generate the same dependency tree regardless of install order. this means it’s important that you commit your yarn.lock file to source control.

security as a core value

why is this important?

how do you know every time you or your deploy process runs npm install you’re getting the same code you got the first time? with npm, you don’t. for example, you might have a faulty cache or there might be an issue with the proxy you’re using. npm will install that different code and not provide a warning or error.

how does yarn address this?

yarn inspects the integrity of each package every time it is installed. it calculates a checksum to ensure you always get the same bits for a specified version of a package. if one line of code changes in left-pad v1.1.2, yarn’s next install of left-pad will fail.

performance

for most node projects, yarn will download and install dependencies faster than npm. note that i said most projects, not all projects. you can see yarn’s own measurements here, or you can check out the various 3rd party comparisons or do some tests yourself.

why does this matter?

while build predictability and security are most important to us, fast dependency installation provides some great benefits. we want your deploys on heroku to be fast -- regardless of whether you’re deploying a review app, staging app, or production app. but even on your local development machine, yarn will get you faster initial project setup and faster addition or upgrade of dependencies. it means you can more easily stay undistracted in your flow state.

how does yarn address this?

whereas npm downloads dependencies sequentially, yarn downloads, compiles if necessary, and installs multiple dependencies in parallel. most computers and network connections these days are capable of installing more than one dependency at a time.

bonus round

so are you excited about yarn yet? if not, here are a few more yarn features that have been useful to me:

  • yarn why left-pad identifies why the left-pad package is installed, showing which other packages depend upon it.
  • yarn licenses generate-disclaimer generates a license disclaimer from installed dependencies.
  • yarn upgrade-interactive interactively upgrades specific dependencies.

yarn + heroku

heroku has full support for yarn. any node.js app with a yarn.lock file will be built using yarn instead of npm. without any additional work, using yarn on heroku will get you predictable, secure, and, possibly, faster deploys. of course, if you want or need to continue using npm, you can.
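a minimal sketch of switching an existing project over and deploying (assuming a git-based heroku app already exists):

$ yarn install                     # reads package.json and writes yarn.lock
$ git add yarn.lock
$ git commit -m "switch to yarn"
$ git push heroku master           # the node.js buildpack detects yarn.lock and builds with yarn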

if you haven’t used yarn yet, i encourage you to check out the getting started guide. it took me just a few minutes to swap out npm for yarn on several existing projects, and just a few more minutes to figure out yarn’s simple commands.


Information

the most innovative apps augment our human senses, intuition, and logic with machine learning. deep learning, modelled after the neural networks of the human brain, continues to grow as one of the most powerful types of machine learning. when applied to images, deep learning enables powerful computer vision features like visual search, product identification, and brand detection.

today, we bring you the einstein vision add-on (beta), allowing heroku developers to easily connect to and use einstein vision, a set of powerful new apis for building ai-powered apps. with this release, salesforce is making it easy for you to embed image recognition directly into your apps. rather than building and managing the specialized infrastructure needed to host deep learning models, simply connect to einstein vision's http/rest api for custom image recognition with little development overhead.

use einstein vision to discover your products across your social media channels, analyze observational data in healthcare and life science applications, and enable visual search in ecommerce apps to delight your customers. get started quickly with pre-trained image classifiers that are automatically available to you when you install the einstein vision add-on.

a simple workflow for custom image recognition

the true strength of einstein vision is its ability to train custom models to recognize the things you care about. creating a custom model takes just a few steps:

  1. plan
  2. collect
  3. train & evaluate
  4. query

let's walk through the workflow to create a brand recognizer for heroku logos and artwork, which is based on the einstein vision example app in node.js.

plan the model: label all the things

in machine learning, the “model” is the brain that answers questions, and “labels” are the possible answers. to have einstein vision recognize specific objects, we will train the model using example images for each label. for the example brand recognizer app, labels represent visual aspects of the heroku brand.

start with the labels of primary interest:

  • heroku logo, isolated logos
  • heroku artwork, various supporting imagery
  • heroku swag, t-shirts, socks, water bottles, etc.

then, think about images that do not contain one of the objects we want to recognize. how will the model answer those questions? let's plan a negative, catch-all label representing the infinite world of objects beyond our target labels:

  • unknown, a random set of things we don't care about

the unknown set may seem like a curiosity at first. remember that the model can only answer questions it's been trained to answer. if you want a clear indication that the model does not match any of your target labels, then train negative labels as well.

collect example images

before diving into the actual machine learning, we must gather example images that represent each of the planned labels. each label needs a variety of example images: in isolation, in normal surroundings, from various angles, with various compositions, and with 2d & 3d representations. this avoids over-fitting the model and improves its flexibility when classifying unseen images. we collect examples by sorting them into a directory named for each label, preparing them for zip upload into a dataset.
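for example, the example images for the brand recognizer might be organized like this before zipping (the file and directory names are just placeholders):

heroku-brand-dataset/
  heroku logo/logo-01.png
  heroku artwork/poster-01.png
  heroku swag/socks-01.jpg
  unknown/random-01.jpg

$ zip -r heroku-brand-dataset.zip heroku-brand-dataset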

while a model can be trained with very few images per label, more training examples will dramatically improve prediction accuracy for unseen images. we've built demos with just a few dozen examples per label, but at least a thousand images per label is recommended for high-confidence predictions.

train the model

once the example images are collected, we will use the rest/http api provided by the add-on to upload the dataset.

the steps to train a model are:

  1. upload example images
  2. initiate training
  3. check training status
  4. inspect the model's metrics

walk through the api flow with our example.
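as a rough sketch of that flow -- the endpoint paths and form fields below are assumptions from memory, so treat the add-on documentation as the source of truth:

# 1. upload the zipped examples as a new dataset (endpoint assumed)
$ curl -X POST \
  -F "data=@./heroku-brand-dataset.zip" \
  -H "Authorization: Bearer xxxxx" \
  https://api.metamind.io/v1/vision/datasets/upload

# 2. initiate training against that dataset (endpoint and fields assumed)
$ curl -X POST \
  -F "name=heroku brand recognizer" \
  -F "datasetId=zzzzz" \
  -H "Authorization: Bearer xxxxx" \
  https://api.metamind.io/v1/vision/train

# 3. check training status for the returned model id (endpoint assumed)
$ curl -H "Authorization: Bearer xxxxx" \
  https://api.metamind.io/v1/vision/train/yyyyy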

performance evaluation

after training, einstein vision automatically evaluates the model using cross-validation. it withholds a random 10% (k-fold = 10) of the example data to test the model. the accuracy of predictions on that held-out data, the testaccuracy, represents how the model will perform in the real world.

fetch model metrics from the api for any trained model to get its testaccuracy. the additional metrics returned may point to examples that confuse the algorithm or to errors that reduced the usable dataset.

to tune a model, revise the source dataset to address any issues and then create, train, and evaluate a new model. after tuning, the model with superior metrics may be considered production-ready.

query the model

once training is complete, the new model will answer queries to classify images by url reference or direct upload. here's an example query using the curl command-line tool:

$ curl -X POST \
  -F "sampleContent=@./path/to/image.jpg" \
  -F "modelId=yyyyy" \
  -H "Authorization: Bearer xxxxx" \
  -H "Content-Type: multipart/form-data" \
  https://api.metamind.io/v1/vision/predict

example json response:

{ "probabilities": [ { "label": "heroku artwork", "probability": 0.53223926 }, { "label": "unknown", "probability": 0.46305126 }, { "label": "heroku swag", "probability": 0.0038324401 }, { "label": "heroku logo", "probability": 0.0008770062 } ], "object": "predictresponse" } 

pipeline to production

one of the wonderful things about heroku apps is that once you get a proof-of-concept running, you can add the app to a pipeline to enable enterprise-grade continuous delivery, including review apps, ci tests (in private beta), and elegant promotion to production.

to share a model created in one app with other apps in a heroku pipeline, such as promoting an app from review app to staging and finally to production, the add-on must be shared between those apps.
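one way to do that is to attach the existing add-on to the other apps in the pipeline; the add-on and app names below are placeholders:

$ heroku addons:attach my-einstein-vision-addon -a my-staging-app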

only the beginning

we can’t wait to see what you build with the einstein vision add-on (beta). einstein vision is free to get started, and we plan to introduce paid plans at ga on march 14th. check out the add-on documentation, then dive in with our node example app or add it to your own app to try it out.


Information

we are happy to announce the general availability of automated certificate management (acm) for all paid heroku dynos. with acm, the cumbersome and costly process of provisioning and managing ssl certificates is replaced with a simple experience that is free for all paid dynos on heroku’s common runtime. creating secure web applications has never been more important, and with acm and the let’s encrypt project, never easier.

acm handles all aspects of ssl/tls certificates for custom domains; you no longer have to purchase certificates, or worry about their expiration or renewal. acm builds directly on our recent release of heroku free ssl to make encryption the default for web applications and helps you protect against eavesdropping, cookie theft, and content hijacking. heroku has always made it easy to add ssl encryption to web applications — today’s release of acm extends that further to automatically generate a tls certificate issued by let’s encrypt for your application’s custom domains.

how it works

new applications

every time you upgrade from a free dyno to a hobby or professional dyno, we will automatically generate a tls certificate for all custom domains on your application. you will need to ensure that your application’s custom domains are pointed to the correct dns targets as specified in heroku domains.
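for instance, a www subdomain would typically be pointed at its heroku dns target with a cname record; the target shown here is illustrative -- use the value reported by heroku domains:

www.example.com.    CNAME    whispering-willow-1234.herokudns.com.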

existing applications

for existing applications, you can enable acm by simply going to your application’s settings page and clicking the “configure ssl” button.

or you can run the cli command:

$ heroku certs:auto:enable -a <app name> 

if your application was not using heroku ssl, update your dns settings for your custom domain to its new dns target and run heroku domains to verify.

run the following to verify whether your application’s domains are covered by automated certificate management:

$ heroku certs:auto 

for more details, including how to migrate from the ssl endpoint add-on, please see our dev center documentation.

feedback

we hope these changes make ssl/tls encryption easier to access and set up for all users. your feedback is welcome and highly appreciated. please write to us by selecting the product feedback option on our feedback page.


Information

this post is going to help save you money if you're running a rails server. it starts like this: you write an app. let's say you're building the next hyper-targeted blogging platform for medium-length posts. when you log in, you see a paginated list of all of the articles you've written. you have a post model; maybe you have a tag model for tags, and a comment model for comments. you write your view so that it renders the posts:

<% @posts.each do |post| %>
  <%= link_to(post.title, post) %>
  <%= teaser_for(post) %>
  <%= "#{post.comments.count} comments" %>
<% end %>
<%= pagination(@posts) %>

see any problems with this? we have to make a single query to return all the posts - that's where the @posts comes from. say that there are n posts returned. in the code above, as the view iterates over each post, it has to calculate post.comments.count - but that in turn needs another database query. this is the n+1 query problem - our initial single query (the 1 in n+1) returns something (of size n) that we iterate over and perform yet another database query on (n of them).

introducing includes

if you've been around the rails track long enough you've probably run into the above scenario before. if you run a google search, the answer is very simple -- "use includes". the code looks like this:

# before
@posts = current_user.posts.per_page(20).page(params[:page])

and after

@posts = current_user.posts.per_page(20).page(params[:page])
@posts = @posts.includes(:comments)

this is still textbook, but let's look at what's going on. active record uses lazy querying, so this won't actually get executed until we call @posts.first or @posts.all or @posts.each. when we do, two queries get executed. the first one, for posts, makes sense:

select * from posts where user_id=? limit ? offset ? 

active record will pass user_id, limit, and offset in as bind params, and you'll get back your array of posts.

note: we almost always want all queries to be scoped with a limit in production apps.

the next query you'll see may look something like this:

select * from comments where post_id in ? 

notice anything wrong? bonus points if you found it, and yes, it has something to do with memory.

if each of those 20 blog posts has 100 comments, then this query will return 2,000 rows from your database. active record doesn't know what data you need from each comment; it just knows it was told you'll eventually need them. so what does it do? it creates 2,000 active record objects in memory, because that's what you told it to do. that's the problem: you don't need 2,000 objects in memory. you don't even need the objects -- you only need the count.

the good: you got rid of your n+1 problem.

the bad: you're stuffing 2,000 (or more) objects from the database into memory when you aren't going to use them at all. this slows down the action and balloons the memory requirements of your app.

it's even worse if the data in the comments is large. for instance, maybe there is no max size for a comment field, and people write thousand-word essays, meaning we'll have to load those really large strings into memory and keep them there until the end of the request even though we're not using them.

n+1 is bad, unneeded memory allocation is worse

now we've got a problem. we could "fix" it by re-introducing our n+1 bug. that's a valid option, and you can easily benchmark it: use rack-mini-profiler in development on a page with a large amount of simulated data. sometimes it's faster to not "fix" your n+1 bugs.

that's not good enough for us, though -- we want no massive memory allocation spikes and no n+1 queries.

counter cache

what's the point of having cache if you can't count it? instead of having to call post.comments.count each time, which costs us a sql query, we can store that data directly inside of the post model. this way when we load a post object we automatically have this info. from the docs for the counter cache you'll see we need to change our model to something like this:

class Comment < ApplicationRecord
  belongs_to :post, counter_cache: :count_of_comments
  # …
end
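note that the counter cache needs a backing integer column on posts; a minimal migration sketch, assuming rails 5 and the column name used above:

class AddCountOfCommentsToPosts < ActiveRecord::Migration[5.0]
  def change
    # column read by post.count_of_comments and kept up to date by the counter cache
    add_column :posts, :count_of_comments, :integer, default: 0, null: false
  end
end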

now in our view, we can call:

 <%= "#{post.count_of_comments} comments" %> 

boom! now we have no n+1 query and no memory problems. but...

counter cache edge cases

you cannot use a counter cache with a condition. let's change our example for a minute. let's say each comment could either be "approved", meaning you moderated it and allow it to show on your page, or "pending". perhaps this is a vital piece of information and you must show it on your page. previously we would have done this:

 <%= "#{ post.comments.approved.count } approved comments" %> <%= "#{ post.comments.pending.count } pending comments" %> 

in this case the comment model has a status field and calling comments.pending is equivalent to adding where(status: "pending"). it would be great if we could have a post.count_of_pending_comments cache and a post.count_of_approved_comments cache, but we can't. there are some ways to hack it, but there are edge cases, and not all apps can safely accommodate for all edge cases. let's say ours is one of those.

now what? we could get around this with some view caching, because if we cache the entire page we only have to render it -- and pay that n+1 cost -- once, and maybe even less often if we are re-using view components with "russian doll" style view caches.

if view caching is out of the question due to <reasons>, what are we left with? we have to use our database the way the original settlers of the wild west did, manually and with great effort.

manually building count data in hashes

in our controller where we previously had this:

@posts = current_user.posts.per_page(20).page(params[:page]) @posts = @posts.includes(:comments) 

we can remove that includes and instead build two hashes. active record returns hashes when we use group(). in this case we know we want to associate comment count with each post, so we group by :post_id.

@posts = current_user.posts.per_page(20).page(params[:page])
post_ids = @posts.map(&:id)
@pending_count_hash  = Comment.pending.where(post_id: post_ids).group(:post_id).count
@approved_count_hash = Comment.approved.where(post_id: post_ids).group(:post_id).count
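for illustration, those hashes map post ids to counts, so they look something like this (the ids and counts are made up). posts with no matching comments simply don't get a key, which is why the view below falls back to || 0:

@pending_count_hash  # => { 1 => 2, 2 => 12 }
@approved_count_hash # => { 1 => 8 }  # post 2 has no approved comments yet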

now we can stash and use this value in our view instead:

 <%= "#{ @approved_count_hash[post.id] || 0 } approved comments" %> <%= "#{ @pending_count_hash[post.id] || 0 } pending comments" %> 

now we have 3 queries, one to find our posts and one for each comment type we care about. this generates 2 extra hashes that hold the minimum of information that we need.

i've found this strategy to be super effective in mitigating memory issues while not sacrificing on the n+1 front.

but what if you're using that data inside of methods?

fat models, low memory

rails encourages you to stick logic inside of models. if you're doing that, then perhaps this count wasn't being queried directly in the view but was instead nested inside a method:

def approved_comment_count
  self.comments.approved.count
end

or maybe you need to do the math, maybe there is a critical threshold where pending comments overtake approved:

def comments_critical_threshold?
  self.comments.pending.count < self.comments.approved.count
end

this is trivial, but you could imagine a more complex case where logic is happening based on business rules. in this case, you don't want to have to duplicate the logic in your view (where we are using a hash) and in your model (where we are querying the database). instead, you can use dependency injection. which is the hyper-nerd way of saying we'll pass in values. we can change the method signature to something like this:

def comments_critical_threshold?(pending_count: comments.pending.count, approved_count: comments.approved.count)
  pending_count < approved_count
end

now i can call it and pass in values:

post.comments_critical_threshold?(pending_count: @pending_count_hash[post.id] || 0,
                                  approved_count: @approved_count_hash[post.id] || 0)

or, if you're using it somewhere else, you can use it without passing in values since we specified our default values for the keyword arguments.

btw, aren't keyword arguments great?

post.comments_critical_threshold? # default values are used here 

there are other ways to write the same code:

def comments_critical_threshold?(pending_count = nil, approved_count = nil)
  pending_count  ||= comments.pending.count
  approved_count ||= comments.approved.count
  pending_count < approved_count
end

you get the gist though -- pass values into your methods if you need to.

more than count

what if you're doing more than just counting? well, you can pull that data and group it in the same way by using select and specifying multiple fields. to keep going with our same example, maybe we want to show a truncated list of all commenter names and their avatar urls:

@comment_names_hash = Comment.where(post_id: post_ids).select(:post_id, :name, :avatar_url).group_by(&:post_id)

the results look like this:

1337: [ { name: "schneems", avatar_url: "https://http.cat/404.jpg" }, { name: "illegitimate45", avatar_url: "https://http.cat/451.jpg" } ] 

the 1337 is the post id, and then we get an entry with a name and an avatar_url for each comment. be careful here, though, as we're returning more data-- you still might not need all of it and making 2,000 hashes isn't much better than making 2,000 unused active record objects. you may want to better constrain your query with limits or by querying for more specific information.
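one way to trim this further -- a sketch, not something from the original example -- is to skip active record objects entirely with pluck, which hands back plain arrays:

@comment_names_hash = Comment.where(post_id: post_ids)
                             .pluck(:post_id, :name, :avatar_url)
                             .group_by(&:first)
# => { 1337 => [[1337, "schneems", "https://http.cat/404.jpg"], ...] }

you trade readability (array indexes instead of method names) for fewer and lighter allocations.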

are we there yet

at this point, we have gotten rid of our n+1 queries and we're hardly using any memory compared to before. yay! self-five. :partyparrot:.

your heroku applications run on top of a curated stack, containing the operating system and other components needed at runtime. we maintain the stack - updating the os, the libraries, and ensuring that known security issues are resolved, so that you can focus on writing code.

today we're announcing the general availability of heroku-16, our curated stack based on ubuntu 16.04 lts. in addition to a new base operating system, heroku-16 is updated with the latest libraries. if you’re a ruby or python developer, heroku-16 includes 15% more development headers at build time, making it easier to compile native packages on heroku. finally, heroku-16 offers a better local development experience when using docker, because of its smaller image size.

since its beta in march, heroku-16 has been tested on thousands of applications and is now ready for production on both common runtime and private spaces apps. heroku-16 will become the stack new applications use (i.e., the default stack) on may 8th, 2017. to learn more about testing and upgrading your app, check out the heroku-16 documentation.

what's new

smaller docker image

with the release of heroku-16, we’ve changed the architecture of the stack, allowing us to provide you with a curated ubuntu 16-based docker image at 465 mb (vs 1.35 gb for cedar-14).

to use heroku-16, specify it as your base image in your dockerfile:

from heroku/heroku:16 

by using the heroku-16 docker image for local development, you ensure the stack running locally is the same stack running on heroku (i.e., dev/prod parity). everyone -- heroku customer or not -- is free to use the heroku-16 docker image.

improved support for compiling native ruby and python packages

at build time heroku-16 includes 15% more development headers than cedar-14. this means fewer failed builds when your app needs to compile native ruby or python packages.

updated stack libraries

heroku-16 should largely be backwards compatible with cedar-14. we have, however, removed lesser used packages to reduce the security surface area and stack image size. apps may also encounter incompatibilities because libraries on heroku-16 have been updated to their most recent versions. learn more about the packages installed in cedar-14 and heroku-16.

how to test and upgrade

testing heroku-16 with your application, especially if you use review apps, is easy. simply define your stack in app.json and create a new pull request:

{ "stack": "heroku-16" } 

if your tests are successful, you can upgrade your application:

$ heroku stack:set heroku-16 -a example-app
…
$ git commit -m "upgrade to heroku-16" --allow-empty
…
$ git push heroku master

for more information on upgrading your app, check out the heroku-16 documentation.

stack support

heroku-16 is now generally available and we recommend you use it for new apps. heroku-16 will be supported through april 2021, when long term support (lts) of ubuntu 16.04 ends. cedar-14, the previous version of our stack, will continue to be supported through april 2019. for more information, check out our stack update policy.


Information

how we built heroku ci: our product intuition checked against what the market wants (we surveyed ~1000 developers to figure out the latter, and the results were surprising)

two approaches to building any product are often in tension: designing from inspiration, and designing from information. on the pure inspiration side, you just build the product you dream of, and trust that it will be so awesome and useful, that it will succeed in the market. on the pure information side, you build exactly what the market is asking for, as best you can tell (think: surveys, top customer feature requests, consultants, customer panels).

our initial design for heroku ci (currently in public beta) was nearly pure inspiration, in large part from our engineering staff. we had a good idea of our dream ci product, and many of the raw materials to build it (heroku apps, our build process, heroku add-ons, etc.). and heroku, like the rest of salesforce, strongly encourages such experimentation and innovation when individual product teams are so inspired. we create and improve a lot of amazing products that way.

heroku ci: design by inspiration

the heroku devex (developer experience) team is distributed across two continents and four countries. we need a well-structured, team-oriented developer workflow as much as our users and customers do, including, of course, running tests for continuous integration (ci). and we are a pretty experienced team of developers, product, and design. we spend a lot of time with customers, and, of course, live and breathe in today's app dev and devops world.

so the heroku ci design began with our own well-formed and granular ideas on what our customers wanted out of ci, and we quickly arrived at what turned out to be a pretty stable initial design.

our alpha launch had heroku ci integrating with, and extending, heroku's continuous delivery (cd) feature set: heroku pipelines, review apps, and heroku github deploys. the initial feature set and developer experience were inspired largely from our own intuitions, experiences, and conceptions of customer need.

heroku ci: ci design by information

once we had a limited-functionality alpha version (mvp of our pre-conceived design for heroku ci), we tested it over several weeks internally, then invited heroku users via twitter and direct e-mail to try it out.

for users to access this "limited public beta" ci feature, they had to complete a survey on how they use ci and what they want from ci. we wanted to determine how they use (and hope to use) ci, and also wanted to make sure those interested could actually use the new heroku ci alpha -- e.g. that we supported the language they required.

as it turned out, so many users responded -- about 1000 for most questions -- that our short survey has become the largest survey of cloud paas developers on ci in recent years. the data helped us shape the planned ci feature set for public release. we also think the survey results will help the dev tools community understand how developers want to use ci, and so we are making the full results available in this post.

[chart: the most important ci features, as ranked by survey respondents]

the survey covers customers' responses on how they want to use ci, and how it fits into their cd and team workflows.

you can read elsewhere about using heroku ci. and you can just try it out yourself: the feature is in public beta and available by default to all heroku users (just check the heroku pipelines web interface). right now let's talk about how we decided what to build.

some customer concerns initially seemed obvious, but we asked anyway: of course we found that ci users want shorter queue times and faster test runs. some features were not as predictable. and some of what's been most successful in our private beta was not requested by surveyed or otherwise vocal users at all.

how companies build successful devops products

almost a year ago, at the heroku devex team offsite, we met up one morning at a london coffee house, the café at the hoxton, shoreditch. an informal conversation on what features/products to build next turned to ci. there had been customer requests, to be sure. and it seemed to us like ci would be a great addition to our heroku flow continuous delivery features. most of all, we just wanted to build it. ci seemed to us so compelling for our users as an integrated feature, and so compelling to us as a very cool thing to build, that there was soon consensus on the team. i'll note here that even at a biggish company like heroku our team excitement around building cool stuff that users will love has a lot of influence over what we build. we didn't do a ton of financial analysis, or call our big-firm analyst. we, the developer experience team, wanted to build ci; there were adequate customer and strategic reasons to do it, and we had the collective excitement and motivation to give it a shot. personally, it’s gratifying to me that software can be built that way at heroku and salesforce.

the hoxton café, by the way, is a fabulous place. you can stay there all day, and we did, occupying valuable retail space and working out what heroku ci might look like: ux, features, integrations, and above all, how the feature would live up to the heroku promise of being (all at once now) simple, prescriptive, and powerful. thankfully they serve breakfast, lunch, and a very nice dinner.

on the rise of devops

[chart: interest over time in "devops"]

[chart: interest over time in "software development"]

ci as a practice in software development has been growing relentlessly over the past decade (even as interest in software development as a holistic discipline has stagnated). "software development" is increasingly many separate disciplines, each with its own constituency and community -- the devops segment alone is projected to be a nearly $3 billion market in 2017. and developer focus on devops has given rise to many great products in the ci market. we wanted to understand how we could help developers have even better ci experiences, get better productivity, and build better products of their own.

our spike on the idea of a "heroku ci" was essentially a simple test runner (for ruby and node) integrated into the heroku pipelines interface. this spike, plus a nice ui and some basic supporting features, constituted the version we released to alpha. being heroku we had a lot of the parts we needed to assemble test environments (ephemeral apps, dynos, add-ons, a build system, …) just lying around, and lots of co-workers willing to try out what we built.

quite quickly, we had an initial ci test-runner bolted onto the existing pipelines feature. we found it useful to us internally and we felt comfortable inviting a few dozen friends-of-heroku to use it and give feedback. within a few more weeks, and a few dozen iterations, we were ready to ask a broader audience to weigh in (and take the survey).

on the rise of pre-integrated devtools offerings?

yes, ci is integrated with heroku pipelines, and in fact they share a ux. while we could separate these in the future, the integrated offering is proving popular with users from big enterprise to small teams. we think this is in large part because it’s integrated.

there was a time when "best of breed" was the catchphrase in devops. you spun together vmware, chef, puppet, jenkins, git, etc. so why are integrated offerings popular now? our thinking is that ci/cd go together like chocolate and peanut butter: it’s a bit messy to put together yourself, but great when someone packages it for you. we noticed that our users, in general, enjoyed using ci products that set-up and integrated easily with heroku pipelines (great products like travis and circle). the popularity of fully integrated ci offerings wasn't lost on us either (look at the popularity of gitlab's integrated ci, or atlassian's increasingly integrated devtools products). among other advantages, you get single auth, add-ons work automatically, and there's no context switching from product to product in use or administration.

the consumerization of devops

as importantly, we've noticed developers are demanding better ux for dev tools in general, and ci in particular (think of the delightful interface of snap ci and others). we also noticed the pain many users described in setting up jenkins, which leads the ci industry but comes with a dizzying array of complex options, more than 1,500 plug-ins, typically long set-up times, and ongoing maintenance labor. an extreme but real example here: one large heroku customer needs to open a ticket with its it department to provision each new jenkins instance, often adding two months to the start-up of a new project if they want to use ci. this company is now using integrated heroku ci on some projects, reducing ci setup time from months to minutes.

the customer survey results (and how we applied them)

preferred developer stack for heroku developers

developers, statistically, are converging on a few preferred workflow stacks. github and slack are the dominant vcs and team chat[ops] tools by most current market measures -- strong integration with these seemed necessary for us, and for any devops product. atlassian's bitbucket and hipchat are each a solid second in these roles. together with trello this would seem to give atlassian a significant enough center of gravity (especially at larger companies) to also require good integration. and gitlab is surging in vcs and ci (and in developer buzz), at a curve that seems poised to challenge much longer-standing products. being part of these popular toolchains is important to us, and where there is overlap in features, developers can choose where they want to live on the way to deployment and delivery.

note that, as heroku is a cloud platform, our user base tends to prefer cloud products. we know also that a significant proportion of our customers who use "on premise" version control options like github enterprise and bitbucket server are hosting them in the cloud, often on aws. so even a portion of these on-prem components are running as a self-managed cloud-like service.

[chart: version control systems used by respondents]

simple integrated solutions over best-of-breed complexity

jenkins is the leader in ci market share (around 70% by most analyst estimates), and its usage is growing at a pretty good clip. the vast majority of jenkins installations are on-premise (about 83%). our customers are always in the cloud, deploying either to heroku's common runtime -- our traditional secure public cloud -- or to heroku private spaces, our virtual private paas cloud offering.

so for the cloud customers, as in our survey, the ci product usage sizes out a bit differently:

[chart: ci tools used by respondents]

i noted earlier the popularity of gitlab's integrated ci being significant to us in our decision to integrate ci. gitlab's been really good at moving fast in bringing popular, highly integrated features to market, building on its popular vcs. note here that gitlab ci is clearly the biggest mover in activity on stack overflow among popular cloud ci solutions.

[chart: stack overflow activity for cloud ci providers]

but all these cloud ci solutions are still dwarfed by jenkins (as noted above, only 17% of jenkins installs are in the cloud).

[chart: ci providers, including jenkins]

interestingly, jenkins’ lovely new "blue ocean" ux, courtesy of cloudbees, seems to underscore the growing importance of simple, prescriptive developer experience in modern ci/cd solutions -- something that has always been, and is still, job 1 at heroku.

speed matters - a lot

you'll notice in the first chart of this post that fast queue times and fast execution times are the most important features for surveyed users, other than price.

we have eliminated queue time altogether with heroku ci. a new ephemeral ci environment (oh: it’s a heroku app) is created at the beginning of each test run, and then destroyed when tests are complete. while there is some nominal setup time here, it’s nothing like traditional queue time waiting for a free ci worker.

environment parity: close enough is good enough for most

note that in the chart at the top of the post, only about 40% of respondents found production environment parity to be "important". when we initially conceived of heroku ci, we thought one of the biggest selling points would be environment parity between test run environments (apps) and staging and production environments. to our surprise, users did not care as much as we thought they would.

this fact led us to an innovation around add-ons for ephemeral environments like ci and review apps. we now inform add-on partners that their add-on is being provisioned for an ephemeral heroku app (a ci app in this case). during provisioning, the add-on vendors can choose to eliminate slow/costly features of the add-on that might take a long time to spin up, but are usually unnecessary for apps that will exist for such a short time – like long-term logging, billing features, etc.

in this way, we are working across the ecosystem of cloud-based services that developers use to make sure each component is automatically configured for ci.

[chart: test dependencies]

we did make sure that you can customize ci apps in the environments section of the newly revised heroku app.json manifest. customizability, when we can offer it in a way that doesn't complicate the experience, is as important as being prescriptive.

and some of the most requested features we under-predicted

[chart: demand for re-running tests]

re-running tests was not part of initial ci alpha, and we were somewhat surprised by the high number of people who requested it both in the initial survey and among alpha users. so there’s now a big "run again" button on the ui, and we do find this feature frequently used in the current public beta.

[chart: uat / browser testing]

browser testing – often called user acceptance testing or uat – was popular with surveyed users, so it was moved up in our list of features planned for launch. uat in a browser is also required by many salesforce developers, whose applications are often web-deployed and/or exist within the salesforce interface. (note that heroku ci will also be used by salesforce developers using apex, hybrid-language apps, or frameworks like lightning.)

the takeaway

product design by inspiration is exhilarating, and it's great to see a product or feature that arises from the sheer excitement of the team to build it succeeding with users. verifying that our intuition was right with users takes a lot of time, but it's well worth it. in short what worked for us was to trust our instincts, but verify with users (and adjust).

most importantly, we view the developer community as a diverse ecosystem of innovators, thought leaders, vendors, and open source efforts, among others. it's a pleasure to share what we know, learn, and create with our vibrant, diverse community.

let us know what you think of heroku ci, and what you want in your ci solutions, at [email protected]


Information

back on august 11, 2016, heroku experienced increased routing latency in the eu region of the common runtime. while the official follow-up report describes what happened and what we've done to avoid this in the future, we found the root cause to be puzzling enough to require a deep dive into linux networking.

the following is a write-up by sre member lex neva (what's sre?) and routing engineer fred hebert (now heroku alumni) of an interesting linux networking "gotcha" they discovered while working on incident 930.

the incident

our monitoring systems paged us about a rise in latency levels across the board in the eu region of the common runtime. we quickly saw that the usual causes didn’t apply: cpu usage was normal, packet rates were entirely fine, memory usage was green as a summer field, request rates were low, and socket usage was well within the acceptable range. in fact, when we compared the eu nodes to their us counterparts, all metrics were at a nicer level than the us ones, except for latency. how to explain this?

one of our engineers noticed that connections from the routing layer to dynos were getting the posix error code eaddrinuse, which is odd.

for a server socket created with listen(), eaddrinuse indicates that the port specified is already in use. but we weren’t talking about a server socket; this was the routing layer acting as a client, connecting to dynos to forward an http request to them. why would we be seeing eaddrinuse?

tcp/ip connections

before we get to the answer, we need a little bit of review about how tcp works.

let’s say we have a program that wants to connect to some remote host and port over tcp. it will tell the kernel to open the connection, and the kernel will choose a source port to connect from. that’s because every ip connection is uniquely specified by a set of 4 pieces of data:

( <source-ip> : <source-port> , <destination-ip> : <destination-port> ) 

no two connections can share this same set of 4 items (called the “4-tuple”). this means that any given host (<source-ip>) can only connect to any given destination (<destination-ip>:<destination-port>) at most 65536 times concurrently, which is the total number of possible values for <source-port>. importantly, it’s okay for two connections to use the same source port, provided that they are connecting to a different destination ip and/or port.
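if you want to see the 4-tuple for a real connection, ruby's standard socket library makes it easy to poke at (example.com here is just an illustrative destination):

require "socket"

sock = TCPSocket.new("example.com", 80)
local  = sock.local_address   # the kernel-chosen <source-ip>:<source-port>
remote = sock.remote_address  # <destination-ip>:<destination-port>
puts "(#{local.ip_address}:#{local.ip_port}, #{remote.ip_address}:#{remote.ip_port})"
sock.close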

usually a program will ask linux (or any other os) to automatically choose an available source port to satisfy the rules. if no port is available (because 65536 connections to the given destination (<destination-ip>:<destination-port>) are already open), then the os will respond with eaddrinuse.

this is a little complicated by a feature of tcp called “time_wait”. when a given connection is closed, the tcp specification declares that both ends should wait a certain amount of time before opening a new connection with the same 4-tuple. this is to avoid the possibility that delayed packets from the first connection might be misconstrued as belonging to the second connection.

generally this time_wait waiting period lasts for only a minute or two. in practice, this means that even if 65536 connections are not currently open to a given destination ip and port, if enough recent connections were open, there still may not be a source port available for use in a new connection. in practice even fewer concurrent connections may be possible since linux tries to select source ports randomly until it finds an available one, and with enough source ports used up, it may not find a free one before it gives up.

port exhaustion in heroku’s routing layer

so why would we see eaddrinuse on connections from the routing layer to dynos? according to our understanding, such an error should not happen: it would mean that a single routing node had 65536 concurrent connections open to a single dyno, and that theoretical limit is far more traffic than any one dyno could ever hope to handle.

we could easily see from our application traffic graphs that no dyno was coming close to this theoretical limit. so we were left with a concerning mystery: how was it possible that we were seeing eaddrinuse errors?

we wanted to prevent the incident from ever happening again, and so we continued to dig - taking a dive into the internals of our systems.

our routing layer is written in erlang, and the most likely candidate was its virtual machine’s tcp calls. digging through the vm’s network layer we got down to the sock_connect call which is mostly a portable wrapper around the linux connect() syscall.

seeing this, it seemed that nothing there was out of place to cause the issue. we'd have to go deeper, into the os itself.

after digging and reading many documents, one of us noticed this bit in the now well-known blog post bind before connect:

bind is usually called for listening sockets so the kernel needs to make sure that the source address is not shared with anyone else. it's a problem. when using this techique [sic] in this form it's impossible to establish more than 64k (ephemeral port range) outgoing connections in total. after that the attempt to call bind() will fail with an eaddrinuse error - all the source ports will be busy.

[...]

when we call bind() the kernel knows only the source address we're asking for. we'll inform the kernel of a destination address only when we call connect() later.

this passage seems to be describing a special case where a client wants to make an outgoing connection with a specific source ip address. we weren’t doing that in our erlang code, so this still didn’t seem to fit our situation well. but the symptoms matched so well that we decided to check for sure whether the erlang vm was doing a bind() call without our knowledge.

we used strace to determine the actual system call sequence being performed. here’s a snippet of strace output for a connection to 10.11.12.13:80:

socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("10.11.12.13")}, 16) = 0

to our surprise, bind() was being called! the socket was being bound to a <source-ip>:<source-port> of 0.0.0.0:0. why?

this instructs the kernel to bind the socket to any ip and any port. this seemed a bit useless to us, as the kernel would already select an appropriate <source-ip> when connect() was called, based on the destination ip address and the routing table.

this bind() call seemed like a no-op. but critically, it required the kernel to select the <source-port> right then and there, without having any knowledge of the other 3 parts of the 4-tuple: <source-ip>, <destination-ip>, and <destination-port>. the kernel would therefore have only 65536 possible choices and might return eaddrinuse, as per the bind() manpage:

eaddrinuse (internet domain sockets) the port number was specified as zero in the socket address structure, but, upon attempting to bind to an ephemeral port, it was determined that all port numbers in the ephemeral port range are currently in use. see the discussion of /proc/sys/net/ipv4/ip_local_port_range in ip(7).

unbeknownst to us, we had been operating for a very long time with a far lower tolerance threshold than expected -- the ephemeral port range was effectively a cap on how much traffic each routing layer instance could handle, while we thought no such limitation existed.

the fix

reading further in bind before connect yields the fix: just set the so_reuseaddr socket option before the bind() call. in erlang this is done by simply passing {reuseaddr, true}.
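the same pattern is easy to illustrate outside erlang. here's a ruby sketch of bind-before-connect with so_reuseaddr set first; the destination address is the one from the strace output above, purely for illustration:

require "socket"

sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM)
# without this option, the explicit bind below forces the kernel to pick a
# source port before it knows the destination, capping total outgoing
# connections at the ephemeral port range
sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, true)
sock.bind(Addrinfo.tcp("0.0.0.0", 0))          # any ip, ephemeral port
sock.connect(Addrinfo.tcp("10.11.12.13", 80))  # destination from the strace example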

at this point we thought we had our answer, but we had to be sure. we decided to test it.

we first wrote a small c program that exercised the current limit:

#include <sys/types.h>
#include <sys/socket.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>
#include <unistd.h>

int main(int argc, char **argv) {
  /* usage: ./connect_with_bind <num> <dest1> <dest2> ... <destn>
   *
   * opens <num> connections to port 80, round-robining between the specified
   * destination ips. then it opens the same number of connections to port
   * 443.
   */
  int i;
  int fds[131072];
  struct sockaddr_in sin;
  struct sockaddr_in dest;

  memset(&sin, 0, sizeof(struct sockaddr_in));
  sin.sin_family = AF_INET;
  sin.sin_port = htons(0);                 // source port 0 (kernel picks one)
  sin.sin_addr.s_addr = htonl(INADDR_ANY); // source ip 0.0.0.0

  for (i = 0; i < atoi(argv[1]); i++) {
    memset(&dest, 0, sizeof(struct sockaddr_in));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(80);

    // round-robin between the destination ips specified
    dest.sin_addr.s_addr = inet_addr(argv[2 + i % (argc - 2)]);

    fds[i] = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    bind(fds[i], (struct sockaddr *)&sin, sizeof(struct sockaddr_in));
    connect(fds[i], (struct sockaddr *)&dest, sizeof(struct sockaddr_in));
  }

  sleep(5);
  fprintf(stderr, "going to start connecting to port 443\n");

  for (i = 0; i < atoi(argv[1]); i++) {
    memset(&dest, 0, sizeof(struct sockaddr_in));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(443);
    dest.sin_addr.s_addr = inet_addr(argv[2 + i % (argc - 2)]);

    fds[i] = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    bind(fds[i], (struct sockaddr *)&sin, sizeof(struct sockaddr_in));
    connect(fds[i], (struct sockaddr *)&dest, sizeof(struct sockaddr_in));
  }

  sleep(5);
}

we increased our file descriptor limit and ran this program as follows:

./connect_with_bind 65536 10.11.12.13 10.11.12.14 10.11.12.15

this program attempted to open 65536 connections to port 80 on the three ips specified. then it attempted to open another 65536 connections to port 443 on the same ips. if only the 4-tuple were in play, we should be able to open all of these connections without any problem.

we ran the program under strace while monitoring ss -s for connection counts. as expected, we began seeing eaddrinuse errors from bind(). in fact, we saw these errors even before we’d opened 65536 connections. the linux kernel does source port allocation by randomly selecting a candidate port and then checking the n following ports until it finds an available port. this is an optimization to prevent it from having to scan all 65536 possible ports for each connection.

once that baseline was established, we added the so_reuseaddr socket option. here are the changes we made:

--- connect_with_bind.c	2016-12-22 10:29:45.916723406 -0500
+++ connect_with_bind_and_reuse.c	2016-12-22 10:31:54.452322757 -0500
@@ -17,6 +17,7 @@
   int fds[131072];
   struct sockaddr_in sin;
   struct sockaddr_in dest;
+  int one = 1;
 
   memset(&sin, 0, sizeof(struct sockaddr_in));
 
@@ -33,6 +34,7 @@
     dest.sin_addr.s_addr = inet_addr(argv[2 + i % (argc - 2)]);
 
     fds[i] = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+    setsockopt(fds[i], SOL_SOCKET, SO_REUSEADDR, &one, sizeof(int));
     bind(fds[i], (struct sockaddr *)&sin, sizeof(struct sockaddr_in));
     connect(fds[i], (struct sockaddr *)&dest, sizeof(struct sockaddr_in));
   }
@@ -48,6 +50,7 @@
     dest.sin_addr.s_addr = inet_addr(argv[2 + i % (argc - 2)]);
 
     fds[i] = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+    setsockopt(fds[i], SOL_SOCKET, SO_REUSEADDR, &one, sizeof(int));
     bind(fds[i], (struct sockaddr *)&sin, sizeof(struct sockaddr_in));
     connect(fds[i], (struct sockaddr *)&dest, sizeof(struct sockaddr_in));
   }

we ran it like this:

./connect_with_bind_and_reuse 65536 10.11.12.13 10.11.12.14 10.11.12.15

our expectation was that bind() would stop returning eaddrinuse. the new program confirmed this fairly rapidly, and showed us once more that there can be quite a gap between what you expect in theory and what actually happens in practice.

knowing this, all we had to do was confirm that the {reuseaddr, true} option on the erlang side would work, and a quick strace of a node performing the call confirmed that the appropriate setsockopt() call was being made.

giving back

it was quite an eye-opening experience to discover this unexpected connection limitation in our routing layer. the patch to vegur, our open-sourced http proxy library, was deployed a couple of days later, preventing this issue from ever biting us again.

we hope that by sharing our experience here, we might save you from similar bugs in your systems.


Information

as part of our commitment to security and support, we periodically upgrade the stack image, so that we can install updated package versions, address security vulnerabilities, and add new packages to the stack. recently we had an incident during which some applications running on the cedar-14 stack image experienced higher than normal rates of segmentation faults and other “hard” crashes for about five hours. our engineers tracked down the cause of the error to corrupted dyno filesystems caused by a failed stack upgrade. the sequence of events leading up to this failure, and the technical details of the failure, are unique, and worth exploring.

background

heroku runs application processes in dynos, which are lightweight linux containers, each with its own, isolated filesystem. our runtime system composes the container’s filesystem from a number of mount points. two of these mount points are particularly critical: the /app mount point, which contains a read-write copy of the application, and the / mount point, which contains the container’s stack image, a prepared filesystem with a complete ubuntu installation. the stack image provides applications running on heroku dynos with a familiar linux environment and a predictable list of native packages. critically, the stack image is mounted read-only, so that we can safely reuse it for every dyno running on the same stack on the same host.

[diagram: heroku dyno filesystem mount points]

given the large number of customer dynos we host, the stack upgrade process is almost entirely automated, and it’s designed so that a new stack image can be deployed without interfering with running dynos, so that our users aren’t exposed to downtime on our behalf. we perform this live upgrade by downloading a disk image of the stack to each dyno host and then reconfiguring each host so that newly-started dynos will use the new image. because we write the newly-downloaded image directly to the data directory our runtime tools use to find images to mount, we have safety checks in the deployment process, based on checksum files, to automatically and safely skip the download if the image is already present on the host.

root causes

near the start of december, we upgraded our container tools. this included changing the digest algorithms and filenames used by these safety checks. we also introduced a latent bug: the new version of our container tools didn't consider the checksum files produced by previous versions. they would happily install any disk image, even one that was already present, as long as the image had not yet been installed under the new tools.

we don’t often re-deploy an existing version of a stack image, so this defect might have gone unnoticed and would eventually have become irrelevant. we rotate hosts out of our runtime fleet and replace them with fresh hosts constantly, and the initial setup of a fresh host downloads the stack image using the same tools we use to roll out upgrades, which would have protected those hosts from the defect. unfortunately, this defect coincided with a second, unrelated problem. several days after the container tools upgrade, one of our engineers attempted to roll out an upgrade to the stack image. issues during this upgrade meant that we had to abort the upgrade, and our standard procedure to ensure that all container hosts are running the same version when we abort an upgrade involves redeploying the original version of the container.

during redeployment, the safety check preventing our tools from overwriting existing images failed, and our container tools truncated and overwrote the disk image file while it was still mounted in running dynos as the / filesystem.

technical impact

the linux kernel expects that a given volume, whether it’s backed by a disk or a file, will go through the filesystem abstraction whenever the volume is mounted. reads and writes that go through the filesystem are cached for future accesses, and the kernel enforces consistency guarantees like “creating a file is an atomic operation” through those apis. writing directly to the volume bypasses all of these mechanisms, completely, and (in true unix fashion) the kernel is more than happy to let you do it.

during the incident, the most relevant consequence for heroku apps involved the filesystem cache: by truncating the disk image, we’d accidentally ensured that reads from the image itself would return no data, while reads served from the filesystem cache would return data from the previously-present filesystem image. there’s very little predictability to which pages will be in the filesystem cache, so the most common effect on applications was that newly-loaded programs would partially load from the cache and partially load from the underlying disk image, which was itself still mid-download. the resulting corrupted programs crashed, often with a segmentation fault, the first time they executed an instruction that attempted to read any of the missing data, or the first time they executed an instruction that had, itself, been damaged.

during the incident, our response lead put together a small example to verify the effects we were seeing. if you have a virtual machine handy, you can reproduce the problem yourself, without all of our container infrastructure. (unfortunately, a docker container won’t cut it: you need something that can create new mount points.)

  1. create a disk image with a simple program on it. we used sleep.

    dd if=/dev/zero of=demo.img bs=1024 count=10240
    mkfs -F -t ext4 demo.img
    sudo mkdir -p /mnt/demo
    sudo mount -o loop demo.img /mnt/demo
    sudo cp -a /bin/sleep /mnt/demo/sleep
    sudo umount /mnt/demo
  2. make a copy of the image, which we’ll use later to simulate downloading the image:

    cp -a demo.img backup.img 
  3. mount the original image, as a read-only filesystem:

    sudo mount -o loop,ro demo.img /mnt/demo 
  4. in one terminal, start running the test program in a loop:

    while /mnt/demo/sleep 1; do
      :
    done
  5. in a second terminal, replace the disk image out from underneath the program by truncating and rewriting it from the backup copy:

    while cat backup.img > demo.img; do
      # flush filesystem caches so that pages are re-read
      echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
    done

reliably, sleep will crash, with segmentation fault (core dumped). this is exactly the error that affected customer applications.

this problem caught us completely by surprise. while we had taken into account that overwriting a mounted image would cause problems, none of us fully understood what those problems would be. while both our monitoring systems and our internal userbase alerted us to the problem quickly, neither was able to offer much insight into the root cause. application crashes are part of the normal state of our platform, and while an increase in crashes is a warning sign we take seriously, it doesn’t correlate with any specific causes. we were also hampered by our belief that our deployment process for stack image upgrades was designed not to modify existing filesystem images.

the fix

once we identified the problem, we migrated all affected dynos to fresh hosts, with non-corrupted filesystems and with coherent filesystem caches. this work took the majority of the five hours during which the incident was open.

in response to this incident, we now mark filesystem images as read-only on the host filesystem once they’re installed. we’ve re-tested this under the conditions that led to the original incident, and we’re confident that this will prevent this and any other overwriting-related problems in the future.

we care deeply about managing security, platform maintenance and other container orchestration tasks, so that your apps "just work" - and we're confident that these changes make our stack management even more robust.


Information

at heroku, we're always working towards improving operational stability with the services we offer. as we recently launched apache kafka on heroku, we've been increasingly focused on hardening apache kafka, as well as our automation around it. this particular improvement in stability concerns kafka's compacted topics, which we haven't talked about before. compacted topics are a powerful and important feature of kafka, and as of 0.9 they provide the foundation for a number of other important features.

meet the bug

the bug we had been seeing is that an internal thread kafka uses to implement compacted topics (which we'll explain more of shortly) can die in certain use cases, without any notification. this leads to long-term failures and instability of the kafka cluster, as old data isn't cleaned up as expected.

to set the stage for the changes we made and the deeper explanation of the bug, we'll cover what log compaction is briefly, and how it works:

just what are compacted topics anyway?

in the default case, kafka topics are a stream of messages:

[ a:1, b:2, c:3 ] 

there is no way to change or delete previous messages, except that messages that are too old get deleted after a specified "retention time."

compacted topics provide a very different type of stream, maintaining only the most recent message of a given key. this produces something like a materialized or table-like view of a stream, with up-to-date values for all things in the key space. these compacted topics work by assigning each message a "key" (a simple java byte[]), with kafka periodically tombstoning or deleting messages in the topic with superseded keys, or by applying a time-based retention window. this tombstoning of repeated keys provides you with a sort of eventual consistency, with the implication that duplicate messages or keys may be present before the cleaning or compaction process has completed.

while this doesn't give you real infinite storage -- you now have to care about what keys you assign and how big the space of keys grows -- it's a useful primitive for many systems.
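a toy way to picture that table-like view in ruby (the keys and values here are made up, and real kafka keys are byte arrays rather than strings):

# a compacted topic converges towards "latest value wins" per key
stream = [["user:1", "ana"], ["user:2", "bo"], ["user:1", "anna"]]
table  = stream.each_with_object({}) { |(key, value), acc| acc[key] = value }
# => {"user:1"=>"anna", "user:2"=>"bo"}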

at heroku we use compacted topics pretty sparingly. they're a much more special purpose tool than regular kafka topics. the largest user is the team who work on heroku's metrics feature, where they power threshold alerts. heroku connect is also starting to use them.

even when end users aren’t taking advantage of compacted topics, kafka makes extensive use of them internally: they provide the persistence and tracking of which offsets consumers and consumer groups have processed. this makes them an essential part of the codebase, so the reliability of compacted topics matters a lot.

how do compacted topics really work?

given the goal of "removing duplicate keys", how does kafka go about implementing this? there are a few important elements. first is that, on disk, kafka breaks messages up into "segments", which are plain files:

my_topic_my_partition_1: [ a:1, b:2, c:3] my_topic_my_partition_4: [ a:4, b:5] 

this notation uses key:offset to represent a message, as these are the primary attributes being manipulated for this task. compaction doesn't care about message values, except that the most recent value for each key is preserved.

secondly, a periodic process -- the log cleaner thread -- comes along and removes messages with duplicate keys. it only indexes the messages that have arrived since the last compaction, which leads to a nice tradeoff where kafka requires only a relatively small amount of memory to remove duplicates from a large amount of data.

the cleaner runs in two phases. in phase 1, it builds an "offset map", from keys to the latest offset for that key. this offset map is only built for "new" messages - the log cleaner marks where it got to when it finished. in phase 2, it starts from the beginning of the log, and rewrites one segment at a time, removing any message which has a lower offset than the message with that key in the offset map.

phase 1

data:

my_topic_my_partition_1: [ a:1, b:2, c:3] my_topic_my_partition_4: [ a:4, b:5 ] 

offset map produced:

{ a:4 b:5 c:3 } 

phase 2

my_topic_my_partition_1: [ a:1, b:2, c:3] my_topic_my_partition_4: [ a:4, b:5 ] 

breaking this cleaning down message-by-message:

  • for the first message, a:1, kafka looks in the offset map, finds a:4, so it doesn't keep this message.
  • for the second message, b:2, kafka looks in the offset map, finds b:5, so it doesn't keep this message.
  • for the third message, c:3, kafka looks in the offset map, and finds no newer message, so it keeps this message.

that's the end of the first segment, so the output is:

my_topic_my_partition_1: [ c:3 ] 

then we clean the second segment:

  • for the first message in the second segment, a:4, kafka looks in the offset map, finds a:4 and so it keeps this message
  • for the second message in the second segment, b:5, kafka looks in the offset map, finds b:5 and so it keeps this message

that's the end of this segment, so the output is:

my_topic_my_partition_4: [ a:4, b:5 ] 

so now, we've cleaned up to the end of the topic. we've elided a few details here. for example, kafka has a relatively complex protocol that enables rewriting whole topics in a crash-safe way. secondly, kafka doesn't ever build an offset map for the latest segment in the log -- the latest segment sees a lot of new messages, so there's no sense in continually recompacting it. lastly, there are some optimizations that merge small log segments into larger files, which avoids littering the filesystem with lots of small files. the last piece of the puzzle is that kafka writes down the highest offset that made it into the offset map -- in this case, offset 5.

let's see what happens when we add some more messages (again ignoring the fact that kafka never compacts using the last segment). in this case c:6 and a:7 are the new messages:

my_topic_my_partition_1: [ c:3 ] my_topic_my_partition_4: [ a:4, b:5 ] my_topic_my_partition_6: [ c:6, a:7 ] 

phase 1

build the offset map:

{ a: 7, c: 6, } 

note well, that the offset map doesn't include b:5! we already built the offset map (in the previous clean) up to that message, and our new offset map doesn't include a message with the key of b at all. this means the compaction process can use much less memory than you'd expect to remove duplicates.

phase 2

clean the log:

my_topic_my_partition_4: [ b:5 ] my_topic_my_partition_6: [ c:6, a:7 ] 
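to make the bookkeeping concrete, here's a tiny ruby model of the two phases described above. it is not kafka's real (scala) implementation -- segments are just arrays of [key, offset] pairs, and cleaned_up_to stands in for the cleaner's checkpoint; crash safety, segment merging, and the "skip the latest segment" rule are all elided:

def compact(segments, cleaned_up_to)
  # phase 1: build the offset map from messages newer than the checkpoint
  offset_map = {}
  segments.flatten(1).each do |key, offset|
    offset_map[key] = offset if offset > cleaned_up_to
  end

  # phase 2: rewrite every segment, dropping any message superseded by a
  # newer offset for the same key
  cleaned = segments.map do |segment|
    segment.reject { |key, offset| offset_map.key?(key) && offset < offset_map[key] }
  end

  [cleaned, offset_map.values.max || cleaned_up_to]
end

segments = [[["a", 1], ["b", 2], ["c", 3]], [["a", 4], ["b", 5]]]
cleaned, checkpoint = compact(segments, 0)
# cleaned    => [[["c", 3]], [["a", 4], ["b", 5]]]
# checkpoint => 5  -- feeding the later c:6, a:7 segment back in with this
#                    checkpoint reproduces the second clean shown above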

what is the bug again?

prior to the most recent version of kafka, the offset map had to fit every key from a whole segment in memory. this simplified some internal accounting, but caused pretty gnarly problems: the cleaner thread crashes if the map doesn't have enough space. the default settings let log segments grow up to 1gb of data, which at very small message sizes can overwhelm the offset map with the sheer number of keys. then, having run out of space in the offset map without fitting in a full segment, an assertion fires and the thread crashes.

what makes this especially bad is kafka's handling of the thread crashing: there's no notification to an operator, the process itself carries on running. this violates a good fundamental principle that if you're going to fail, fail loudly and publicly.

with a broker running without this thread in the long term, data that is meant to be compacted grows and grows. this threatens the stability of that node, and if the same crash happens on other brokers, the stability of the whole cluster.

what is the fix?

the fix was relatively simple, and a common theme in software: "stop doing that bad thing". after spending quite some time understanding the compaction behavior (as explained above), the code change was a simple 100 line patch. the fix means kafka no longer tries to fit a whole segment in the offset map; instead, it can mark "i got partway through a log segment when building the map".

the first step was to remove the assertion that caused the log cleaner thread to die. then, we reworked the internal tracking such that we can record a partial segment load and recover from that point.

the outcome now, is that the log cleaner thread doesn't die silently. this was a huge stress reliever for us - we've seen this happen in production multiple times, and recovering from it is quite tricky.

conclusion

working with the kafka community on this bug was a great experience. we filed a jira ticket and talked through potential solutions. after a short while, jun rao and jay kreps had a suggested solution, which was what we implemented. after some back and forth with code review, the patch was committed and made it into the latest release of kafka.

this fix is in kafka 0.10.1.1, which is now available and the default version on heroku. you can provision a new cluster like so:

$ heroku addons:create heroku-kafka 

for existing customers, you can upgrade to this release of kafka like so:

$ heroku kafka:upgrade heroku-kafka --version 0.10 

Information

the ruby maintainers continued their annual tradition by gifting us a new ruby version to celebrate the holiday: ruby 2.4 is now available and you can try it out on heroku.

ruby 2.4 brings some impressive new features and performance improvements to the table. here are a few of the big ones:

binding#irb

have you ever used p or puts to get the value of a variable in your code? if you’ve been writing ruby the odds are pretty good that you have. the alternative repl pry (http://pryrepl.org/) broke many of us of this habit, but installing a gem to get a repl during runtime isn’t always an option, or at least not a convenient one.

enter binding.irb, a new native runtime invocation for the irb repl that ships with ruby. now you can simply add binding.irb to your code to open an irb session and have a look around:

# ruby-2.4.0
class SuperConfusing
  def what_is_even_happening_right_now
    @x = @xy[:y] ** @x
    binding.irb # open a repl here to examine @x, @xy,
                # and possibly your life choices
  end
end

one integer to rule them all

ruby previously used 3 classes to handle integers: the abstract super class integer, the fixnum class for small integers and the bignum class for large integers. you can see this behavior yourself in ruby 2.3:

# ruby-2.3.3
irb> 1.class           # => Fixnum
irb> (2**100).class    # => Bignum
irb> Fixnum.superclass # => Integer
irb> Bignum.superclass # => Integer

ruby 2.4 unifies the fixnum and bignum classes into a single concrete class integer:

# ruby-2.4.0
irb> 1.class        # => Integer
irb> (2**100).class # => Integer

why did we ever have two classes of integer?

to improve performance ruby stores small numbers in a single native machine word whenever possible, either 32 or 64 bits in length depending on your processor. a 64-bit processor has a 64-bit word length; the 64 in this case describes the size of the registers on the processor.

the registers allow the processor to handle simple arithmetic and logical comparisons, for numbers up to the word size, by itself, which is much faster than manipulating values stored in ram.

on my laptop it's more than twice as fast for me to add 1 to a fixnum a million times than it is to do the same with a bignum:

# ruby-2.3.3
require "benchmark"

fixnum = 2**40
bignum = 2**80
n = 1_000_000

Benchmark.bm do |x|
  x.report("adding #{fixnum.class}:") { n.times { fixnum + 1 } }
  x.report("adding #{bignum.class}:") { n.times { bignum + 1 } }
end

# =>
#                      user     system      total        real
# adding Fixnum:   0.190000   0.010000   0.200000 (  0.189790)
# adding Bignum:   0.460000   0.000000   0.460000 (  0.471123)

when a number is too big to fit in a native machine word ruby will store that number differently, automatically converting it to a bignum behind the scenes.

how big is too big?

well, that depends. it depends on the processor you’re using, as we’ve discussed, but it also depends on the operating system and the ruby implementation you’re using.

wait it depends on my operating system?

yes, different operating systems use different c data type models.

when processors first started shipping with 64-bit registers it became necessary to augment the existing data types in the c language, to accommodate larger register sizes and take advantage of performance increases.

unfortunately, the c language doesn't provide a mechanism for adding new fundamental data types. these augmentations had to be accomplished via alternative data models like lp64, ilp64 and llp64.

ll-what now?

lp64, ilp64 and llp64 are some of the data models used in the c language. this is not an exhaustive list of the available c data models but these are the most common.

the first few characters in each of these acronyms describe the data types they affect. for example, the "l" and "p" in the lp64 data model stand for long and pointer, because lp64 uses 64-bits for those data types.

these are the sizes, in bits, of the relevant data types under these common data models:

|       | int | long | long long | pointer |
|-------|-----|------|-----------|---------|
| lp64  | 32  | 64   | n/a       | 64      |
| ilp64 | 64  | 64   | n/a       | 64      |
| llp64 | 32  | 32   | 64        | 64      |

almost all unix and linux implementations use lp64, including os x. windows uses llp64, which includes a new long long type, just like long but longer.

so the maximum size of a fixnum depends on your processor and your operating system, in part. it also depends on your ruby implementation.

fixnum size by ruby implementation

| fixnum range         | min    | max       |
|----------------------|--------|-----------|
| 32-bit cruby (ilp32) | -2**30 | 2**30 - 1 |
| 64-bit cruby (llp64) | -2**30 | 2**30 - 1 |
| 64-bit cruby (lp64)  | -2**62 | 2**62 - 1 |
| jruby                | -2**63 | 2**63 - 1 |

the range of fixnum can vary quite a bit between ruby implementations.

in jruby for example a fixnum is any number between -2**63 and 2**63 - 1. cruby will either have fixnum values between -2**30 and 2**30 - 1 or -2**62 and 2**62 - 1, depending on the underlying c data model.

your numbers are wrong, you're not using all the bits

you're right, even though we have 64 bits available we're only using 62 of them in cruby and 63 in jruby. both of these implementations use two's complement integers, binary values that use one of the bits to store the sign of the number. so that accounts for one of our missing bits, how about that other one?

in addition to the sign bit, cruby uses one of the bits as a fixnum_flag, to tell the interpreter whether or not a given word holds a fixnum or a reference to a larger number. the sign bit and the flag bit are at opposite ends of the 64-bit word, and the 62 bits left in the middle are the space we have to store a number.
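you can see that boundary for yourself on ruby 2.3 -- assuming a 64-bit lp64 system (linux or os x), the flip from fixnum to bignum happens right at 2**62:

# ruby-2.3.3 on 64-bit lp64
irb> (2**62 - 1).class # => Fixnum
irb> (2**62).class     # => Bignum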

in jruby we have 63 bits to store our fixnum, because jruby stores both fixnum and bignum as 64-bit signed values; they don't need a fixnum_flag.

why are they changing it now?

the ruby team feels that the difference between a fixnum and a bignum is ultimately an implementation detail, and not something that needs to be exposed as part of the language.

using the fixnum and bignum classes directly in your code can lead to inconsistent behavior, because the range of those values depends on so many things. they don't want to encourage you to depend on the ranges of these different integer types, because it makes your code less portable.

unification also significantly simplifies ruby for beginners. when you're teaching your friends ruby you no longer need to explain the finer points of 64-bit processor architecture.
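to illustrate, in ruby 2.4 every whole number reports the same class regardless of magnitude, and the old constants remain only as deprecated aliases of integer (the output below is from a 2.4 release candidate and may vary slightly):

# ruby-2.4.0-rc1
irb> 1.class
# => Integer
irb> (2**80).class
# => Integer
irb> Fixnum
# warning: constant ::Fixnum is deprecated
# => Integer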

rounding changes

in ruby float#round has always rounded floating point numbers up for decimal values greater than or equal to .5, and down for anything less, much as you learned to expect in your arithmetic classes.

# ruby-2.3.3
irb> (2.4).round
# => 2
irb> (2.5).round
# => 3

during the development of ruby 2.4 there was a proposal to change this default rounding behavior to instead round to the nearest even number, a strategy known as half to even rounding, or gaussian rounding (among many other names).

# ruby-2.4.0-preview3
irb> (2.4).round
# => 2
irb> (2.5).round
# => 2
irb> (3.5).round
# => 4

the half to even strategy would only have changed rounding behavior for tie-breaking; numbers that are exactly halfway (.5) would have been rounded down for even numbers, and up for odd numbers.

why would anyone do that?

the gaussian rounding strategy is commonly used in statistical analysis and financial transactions, because rounding ties to even alters the average of a large sample set far less than always rounding up.

as an example let's generate a large set of random values that all end in .5:

# ruby-2.3.3
irb> halves = Array.new(1000) { rand(1..1000) + 0.5 }
# => [578.5...120.5] # 1000 random numbers between 1.5 and 1000.5

now we'll calculate the average after forcing our sum to be a float, to ensure we don't end up doing integer division:

# ruby-2.3.3
irb> average = halves.inject(:+).to_f / halves.size
# => 510.675

the actual average of all of our numbers is 510.675, so the ideal rounding strategy should give us a rounded average as close to that number as possible.

let's see how close we get using the existing rounding strategy:

# ruby-2.3.3
irb> round_up_average = halves.map(&:round).inject(:+).to_f / halves.size
# => 511.175
irb> (average - round_up_average).abs
# => 0.5

we're off from the actual average by 0.5 when we consistently round ties up, which makes intuitive sense. so let's see if we can get closer with gaussian rounding:

# ruby-2.3.3
irb> rounded_halves = halves.map { |n| n.to_i.even? ? n.floor : n.ceil }
# => [578...120]
irb> gaussian_average = rounded_halves.inject(:+).to_f / halves.size
# => 510.664
irb> (average - gaussian_average).abs
# => 0.011000000000024102

it would appear we have a winner. rounding ties to the nearest even number brings us more than 97% closer to our actual average. for larger sample sets we can expect the average from gaussian rounding to be almost exactly the actual average.

this is why gaussian rounding is the recommended default rounding strategy in the ieee standard for floating-point arithmetic (ieee 754).

so ruby decided to change it because of ieee 754?

not exactly, it actually came to light because gaussian rounding is already the default strategy for the kernel#sprintf method, and an astute user filed a bug on ruby: "rounding modes inconsistency between round versus sprintf".

here we can clearly see the difference in behavior between kernel#sprintf and float#round:

# ruby 2.3.3
irb(main):001:0> sprintf('%1.0f', 12.5)
# => "12"
irb(main):002:0> (12.5).round
# => 13

the inconsistency in this behavior prompted the proposed change, which actually made it into one of the ruby 2.4 preview versions, ruby-2.4.0-preview3:

# ruby 2.4.0-preview3
irb(main):006:0> sprintf('%1.0f', 12.5)
# => "12"
irb(main):007:0> 12.5.round
# => 12

in ruby-2.4.0-preview3 rounding with either kernel#sprintf or float#round will give the same result.

ultimately matz decided this fix should not alter the default behavior of float#round when another user reported a bug in rails: "breaking change in how #round works".

the ruby team decided to compromise and add a new keyword argument to float#round to allow us to set alternative rounding strategies ourselves:

# ruby 2.4.0-rc1
irb(main):001:0> (2.5).round
# => 3
irb(main):008:0> (2.5).round(half: :down)
# => 2
irb(main):009:0> (2.5).round(half: :even)
# => 2

the keyword argument :half can take either :down or :even and the default behavior is still to round up, just as it was before.

why preview versions are not for production

interestingly before the default rounding behavior was changed briefly for 2.4.0-preview3 there was an unusual kernel#sprintf bug in 2.4.0-preview2:

# ruby 2.4.0-preview2
irb> numbers = (1..20).map { |n| n + 0.5 }
# => [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5, 19.5, 20.5]
irb> numbers.map { |n| sprintf('%1.0f', n) }
# => ["2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "12", "14", "14", "16", "16", "18", "18", "20", "20"]

in this example kernel#sprintf appears to be rounding numbers less than 12 up as though it was using the float#round method's default behavior, which was still in place at this point.

the preview releases before and after 2.4.0-preview2, both 2.4.0-preview1 and 2.4.0-preview3, show the expected sprintf behavior, consistent with ruby-2.3.3:

# ruby 2.4.0-preview1
irb> numbers.map { |n| sprintf('%1.0f', n) }
# => ["2", "2", "4", "4", "6", "6", "8", "8", "10", "10", "12", "12", "14", "14", "16", "16", "18", "18", "20", "20"]

# ruby 2.4.0-preview3
irb> numbers.map { |n| sprintf('%1.0f', n) }
# => ["2", "2", "4", "4", "6", "6", "8", "8", "10", "10", "12", "12", "14", "14", "16", "16", "18", "18", "20", "20"]

i discovered this by accident while researching this article and started digging through the 2.4.0-preview2 changes to see if i could identify the cause. i found this commit from nobu:

commit 295f60b94d5ff6551fab7c55e18d1ffa6a4cf7e3
author: nobu <[email protected]>
date:   sun jul 10 05:27:27 2016 +0000

    util.c: round nearly middle value

    * util.c (ruby_dtoa): [experimental] adjust the case that the
      float value is close to the exact but unrepresentable middle
      value of two values in the given precision, as r55604.

    git-svn-id: svn+ssh:[email protected] b2dd03c8-39d4-4d8f-98ff-823fe69b080e

kernel#sprintf accuracy in ruby 2.4

this was an early effort by nobu to handle cases where floating point numbers rounded inconsistently with kernel#sprintf in ruby-2.3.3 (and before):

# ruby-2.3.3
irb> numbers = (0..9).map { |n| "5.0#{n}5".to_f }
# => [5.005, 5.015, 5.025, 5.035, 5.045, 5.055, 5.065, 5.075, 5.085, 5.095]
irb> numbers.map { |n| sprintf("%.2f", n) }
# => ["5.00", "5.01", "5.03", "5.04", "5.04", "5.05", "5.07", "5.08", "5.08", "5.09"]

in the example above notice that 5.035 and 5.045 both round to 5.04. no matter what strategy kernel#sprintf is using this is clearly unexpected. the cause turns out to be precision hidden beyond the digits we wrote: the stored floating point values are not exactly 5.035 and 5.045, so the rounding is applied to numbers slightly above or below the literals we typed.
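you can see this hidden precision by asking for more digits than the literal shows (the exact digits depend on the ieee 754 double representation on your platform, which is why no output is printed here):

# ruby-2.3.3
# print the stored doubles with far more digits than the literals we typed
puts format("%.20f", 5.035)
puts format("%.20f", 5.045)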

not to worry though, the final version of nobu's fixes resolves this issue, and it will be available in ruby 2.4.

kernel#sprintf will now consistently apply half to even rounding:

# ruby-2.4.0-rc1
irb> numbers = (0..9).map { |n| "5.0#{n}5".to_f }
# => [5.005, 5.015, 5.025, 5.035, 5.045, 5.055, 5.065, 5.075, 5.085, 5.095]
irb> numbers.map { |n| sprintf("%.2f", n) }
# => ["5.00", "5.02", "5.02", "5.04", "5.04", "5.06", "5.06", "5.08", "5.08", "5.10"]

better hashes

ruby 2.4 introduces some significant changes to the hash table backing ruby's hash object. these changes were prompted by vladimir makarov when he submitted a patch to ruby's hash table earlier this year.

if you have a couple of hours to spare that issue thread is an entertaining read, but on the off-chance you're one of those busy developers i'll go through the major points here. first we need to cover some ruby hash basics.

if you're already an expert on ruby hash internals feel free to skip ahead and read about the specific hash changes in ruby 2.4.

how ruby implements hash

let's imagine for a moment that we have a severe case of "not invented here" syndrome, and we've decided to make our own hash implementation in ruby using arrays. i'm relatively certain we're about to do some groundbreaking computer science here so we'll call our new hash turbohash, as it's certain to be faster than the original:

# turbo_hash.rb
class TurboHash
  attr_reader :table

  def initialize
    @table = []
  end
end

we'll use the @table array to store our table entries. we gave ourselves a reader to access it so it's easy to peek inside our hash.

we're definitely going to need methods to set and retrieve elements from our revolutionary hash so let's get those in there:

# turbo_hash.rb
class TurboHash
  # ...

  def [](key)
    # remember our entries look like this:
    # [key, value]
    find(key).last
  end

  def find(key)
    # Enumerable#find here will return the first entry that makes
    # our block return true, otherwise it returns nil.
    @table.find do |entry|
      key == entry.first
    end
  end

  def []=(key, value)
    entry = find(key)

    if entry
      # if we already stored it just change the value
      entry[1] = value
    else
      # otherwise add a new entry
      @table << [key, value]
    end
  end
end

excellent, we can set and retrieve keys. it's time to set up some benchmarking and admire our creation:

require "benchmark" legacy = hash.new turbo = turbohash.new n = 10_000 def set_and_find(target) target = rand target[key] = rand target[key] end benchmark.bm do |x| x.report("hash: ") { n.times { set_and_find(legacy) } } x.report("turbohash: ") { n.times { set_and_find(turbo) } } end # user system total real # hash: 0.010000 0.000000 0.010000 ( 0.009026) # turbohash: 45.450000 0.070000 45.520000 ( 45.573937) 

well that could have gone better, our implementation is about 5000 times slower than ruby's hash. this is obviously not the way hash is actually implemented.

in order to find an element in @table our implementation traverses the entire array on each iteration; towards the end we're checking nearly 10k entries one at a time.

so let's come up with something better. the iteration is killing us, if we can find a way to index instead of iterating we'll be way ahead.

if we knew our keys were always going to be integers we could just store the values at their indexes inside of @table and look them up by their indexes later.

the issue of course is that our keys can be anything, we're not building some cheap knock-off hash that can only take integers.

we need a way to turn our keys into numbers in a consistent way, so "some_key" will give us the same number every time, and we can regenerate that number to find it again later.

it turns out that object#hash is perfect for this purpose:

irb> "some_key".hash # => 3031662902694417109 irb> "some_other_key".hash # => -3752665667844152731 irb> "some_key".hash # => 3031662902694417109 

object#hash will return unique(ish) integers for any object in ruby, and you'll get the same number back every time you run it again with an object that's "equal" to the previous object.

for example, every time you create a string in ruby you'll get a unique object:

irb> a = "some_key" # => "some_key" irb> a.object_id # => 70202008509060 irb> b = "some_key" # => "some_key" irb> b.object_id # => 70202008471340 

these are clearly distinct objects, but they will have the same object#hash return value because a == b:

irb> a.hash
# => 3031662902694417109
irb> b.hash
# => 3031662902694417109

these hash return values are huge and sometimes negative, so we're going to use the remainder after dividing by some small number as our index instead:

irb> a.hash % 11
# => 8

we can use this new number as the index in @table where we store the entry. when we want to look up an item later we can simply repeat the operation to know exactly where to find it.

this raises another issue, however: our new indexes are much less unique than they were originally; they range between 0 and 10. if we store more than 11 items we are certain to have collisions, overwriting existing entries.

rather than storing the entries directly in the table we'll put them inside arrays called "bins". each bin will end up having multiple entries, but traversing the bins will still be faster than traversing the entire table.

armed with our new indexing system we can now make some improvements to our turbohash.

our @table will hold a collection of bins and we'll store our entries in the bin that corresponds to key.hash % 11:

# turbo_hash.rb
class TurboHash
  NUM_BINS = 11

  attr_reader :table

  def initialize
    # we know our indexes will always be between 0 and 10
    # so we need an array of 11 bins.
    @table = Array.new(NUM_BINS) { [] }
  end

  def [](key)
    find(key).last
  end

  def find(key)
    # now we're searching inside the bins instead of the whole table
    bin_for(key).find do |entry|
      key == entry.first
    end
  end

  def bin_for(key)
    # since hash will always return the same thing we know right where to look
    @table[index_of(key)]
  end

  def index_of(key)
    # a pseudorandom number between 0 and 10
    key.hash % NUM_BINS
  end

  def []=(key, value)
    entry = find(key)

    if entry
      entry[1] = value
    else
      # store new entries in the bins
      bin_for(key) << [key, value]
    end
  end
end

let's benchmark our new and improved implementation:

                  user     system      total        real
Hash:         0.010000   0.000000   0.010000 (  0.012918)
TurboHash:    3.800000   0.010000   3.810000 (  3.810126)

so that's pretty good i guess, using bins decreased the time for turbohash by more than 90%. those sneaky ruby maintainers are still crushing us though, let's see what else we can do.

it occurs to me that our benchmark is creating 10_000 entries but we only have 11 bins. each time we iterate through a bin we're actually going over a pretty large array now.

let's check out the sizes on those bins after the benchmark finishes:

bin:  relative size:          length:
----------------------------------------
0     +++++++++++++++++++     (904)
1     ++++++++++++++++++++    (928)
2     +++++++++++++++++++     (909)
3     ++++++++++++++++++++    (915)
4     +++++++++++++++++++     (881)
5     +++++++++++++++++++     (886)
6     +++++++++++++++++++     (876)
7     ++++++++++++++++++++    (918)
8     +++++++++++++++++++     (886)
9     ++++++++++++++++++++    (952)
10    ++++++++++++++++++++    (945)

that's a nice even distribution of entries but those bins are huge. how much faster is turbohash if we increase the number of bins to 19?

                  user     system      total        real
Hash:         0.020000   0.000000   0.020000 (  0.021516)
TurboHash:    2.870000   0.070000   2.940000 (  3.007853)

bin:  relative size:            length:
----------------------------------------
0     ++++++++++++++++++++++    (548)
1     +++++++++++++++++++++     (522)
2     ++++++++++++++++++++++    (547)
3     +++++++++++++++++++++     (534)
4     ++++++++++++++++++++      (501)
5     +++++++++++++++++++++     (528)
6     ++++++++++++++++++++      (497)
7     +++++++++++++++++++++     (543)
8     +++++++++++++++++++       (493)
9     ++++++++++++++++++++      (500)
10    +++++++++++++++++++++     (526)
11    ++++++++++++++++++++++    (545)
12    +++++++++++++++++++++     (529)
13    ++++++++++++++++++++      (514)
14    ++++++++++++++++++++++    (545)
15    ++++++++++++++++++++++    (548)
16    +++++++++++++++++++++     (543)
17    ++++++++++++++++++++      (495)
18    +++++++++++++++++++++     (542)

we gained another 25%! that's pretty good, i bet it gets even better if we keep making the bins smaller. this is a process called rehashing, and it's a pretty important part of a good hashing strategy.

let's cheat and peek inside st.c to see how ruby handles increasing the table size to accommodate more bins:

/* https://github.com/ruby/ruby/blob/ruby_2_3/st.c#l38 */

#define ST_DEFAULT_MAX_DENSITY 5
#define ST_DEFAULT_INIT_TABLE_SIZE 16

ruby's hash table starts with 16 bins. how do they get away with 16 bins? weren't we using prime numbers to reduce collisions?

we were, but using prime numbers for hash table size is really just a defense against bad hashing functions. ruby has a much better hashing function today than it once did, so the ruby maintainers stopped using prime numbers in ruby 2.2.0.

what's this other default max density number?

the st_default_max_density defines the average maximum number of entries ruby will allow in each bin before rehashing: choosing the next largest power of two and recreating the hash table with the new, larger size.

you can see the conditional that checks for this in the add_direct function from st.c:

/* https://github.com/ruby/ruby/blob/ruby_2_3/st.c#l463 */

if (table->num_entries > ST_DEFAULT_MAX_DENSITY * table->num_bins) {...}

ruby's hash table tracks the number of entries as they're added using the num_entries value on table. this way ruby doesn't need to count the entries to decide if it's time to rehash, it just checks to see if the number of entries is more than 5 times the number of bins.

let's implement some of the improvements we stole from ruby to see if we can speed up turbohash:

class TurboHash
  STARTING_BINS = 16

  attr_accessor :table

  def initialize
    @max_density = 5
    @entry_count = 0
    @bin_count   = STARTING_BINS
    @table       = Array.new(@bin_count) { [] }
  end

  def grow
    # use bit shifting to get the next power of two and reset the table size
    @bin_count = @bin_count << 1

    # create a new table with a much larger number of bins
    new_table = Array.new(@bin_count) { [] }

    # copy each of the existing entries into the new table at their new location,
    # as returned by index_of(key)
    @table.flatten(1).each do |entry|
      new_table[index_of(entry.first)] << entry
    end

    # finally we overwrite the existing table with our new, larger table
    @table = new_table
  end

  def full?
    # our bins are full when the number of entries surpasses 5 times the number of bins
    @entry_count > @max_density * @bin_count
  end

  def [](key)
    find(key).last
  end

  def find(key)
    bin_for(key).find do |entry|
      key == entry.first
    end
  end

  def bin_for(key)
    @table[index_of(key)]
  end

  def index_of(key)
    # use @bin_count because it now changes each time we resize the table
    key.hash % @bin_count
  end

  def []=(key, value)
    entry = find(key)

    if entry
      entry[1] = value
    else
      # grow the table whenever we run out of space
      grow if full?

      bin_for(key) << [key, value]
      @entry_count += 1
    end
  end
end

so what's the verdict?

                  user     system      total        real
Hash:         0.010000   0.000000   0.010000 (  0.012012)
TurboHash:    0.130000   0.010000   0.140000 (  0.133795)

we lose. even though our turbohash is now 95% faster than our last version, ruby still beats us by an order of magnitude.

all things considered, i think turbohash fared pretty well. i'm sure there are some ways we could further improve this implementation but it's time to move on.

at long last we have enough background to explain what exactly is about to nearly double the speed of ruby hashes.

what actually changed

speed! ruby 2.4 hashes are significantly faster. the changes introduced by vladimir makarov were designed to take advantage of modern processor caching improvements by focusing on data locality.

this implementation speeds up the ruby hash table benchmarks in average by more 40% on intel haswell cpu.

https://github.com/ruby/ruby/blob/trunk/st.c#l93

oh good! what?

processors like the intel haswell series use several levels of caching to speed up operations that reference the same region of memory.

when the processor reads a value from memory it doesn't just take the value it needs; it grabs a large piece of memory nearby, operating on the assumption that it is likely going to be asked for some of that data in the near future.

the exact algorithms processors use to determine which bits of memory should get loaded into each cache are somewhat difficult to discover. manufacturers consider these strategies to be trade secrets.

what is clear is that accessing any of the levels of caching is significantly faster than going all the way out to pokey old ram to get information.

how much faster?

real numbers here are almost meaningless to discuss because they depend on so many factors within a given system, but generally speaking we can say that l1 cache hits (the fastest level of caching) could speed up memory access by two orders of magnitude or more.

an l1 cache hit can complete in half a nanosecond. for reference consider that a photon can only travel half a foot in that amount of time. fetching from main memory will generally take at least 100 nanoseconds.

got it, fast... therefore data locality?

exactly. if we can ensure that the data ruby accesses frequently is stored close together in main memory, we significantly increase our chances of winning a coveted spot in one of the caching levels.

one of the ways to accomplish this is to decrease the overall size of the entries themselves. the smaller the entries are, the more likely they are to end up in the same caching level.

in our turbohash implementation above our entries were stored as simple arrays, but in ruby-2.3.3 table entries were actually stored in a linked list. each of the entries contained a next pointer that pointed to the next entry in the list. if we can find a way to get by without that pointer and make the entries smaller we will take better advantage of the processor's built-in caching.

the new approach in ruby-2.4.0-rc1 actually goes even further than just removing the next pointer: it removes the entries from the bins entirely. instead we store the entries in a separate array, the "entries array", and we record the indexes for those entries in the bins array, referenced by their keys.

this approach is known as "open addressing".

open addressing

ruby has historically used "closed addressing" in its hash table, also known as "open hashing". the new alternative approach proposed by vladimir makarov uses "open addressing", also known as "closed hashing". i get that naming things is hard, but this can really get pretty confusing. for the rest of this discussion, i will only use open addressing to refer to the new implementation, and closed addressing to refer to the former.

the reason open addressing is considered open is that it frees us from the hash table. the table entries themselves are not stored directly in the bins anymore, as with a closed addressing hash table, but rather in a separate entries array, ordered by insertion.

open addressing uses the bins array to map keys to their index in the entries array.

let's set a value in an example hash that uses open addressing:

# ruby-2.4.0-rc1
irb> my_hash["some_key"] = "some_value"

when we set "some_key" in an open addressing hash table ruby will use the hash of the key to determine where our new key-index reference should live in the bins array:

irb> "some_key".hash # => -3336246172874487271 

ruby first appends the new entry to the entries array, noting the index where it was stored. ruby then uses the hash above to determine where in the bins array to store the key, referencing that index.

remember that the entry itself is not stored in the bins array, the key only references the index of the entry in the entries array.
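to make that structure concrete, here is a minimal ruby sketch of the idea. this is not ruby's actual c implementation -- there is no growing or rehashing, and collisions are resolved with a simple linear probe rather than the lcg described later -- but it shows the bins-of-indexes layout:

# a sketch only: bins hold indexes, the entries array holds the data in insertion order
class OpenAddressingHash
  def initialize(bin_count = 16)
    @bin_count = bin_count             # must be a power of two for the bit mask below
    @bins      = Array.new(bin_count)  # each slot is nil or an index into @entries
    @entries   = []                    # [key, value] pairs, ordered by insertion
  end

  def []=(key, value)
    bin = find_bin(key)

    if @bins[bin]
      @entries[@bins[bin]][1] = value  # key already present, update it in place
    else
      @entries << [key, value]         # append the new entry...
      @bins[bin] = @entries.size - 1   # ...and record its index in the bin
    end
  end

  def [](key)
    index = @bins[find_bin(key)]
    index && @entries[index][1]
  end

  private

  # pick a bin from the low bits of the hash, then probe linearly on collisions.
  # with no growing, this would loop forever once every bin is occupied.
  def find_bin(key)
    bin = key.hash & (@bin_count - 1)
    until @bins[bin].nil? || @entries[@bins[bin]].first == key
      bin = (bin + 1) % @bin_count
    end
    bin
  end
end

the important property is that the entries array preserves insertion order while the bins array stays small, which is exactly what improves data locality in the real implementation.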

determining the bin

the lower bits of the key's hash itself are used to determine where it goes in the bins array.

because we're not using all of the available information from the key's hash this process is "lossy", and it increases the chances of a later hash collision when we go to find a bin for our key.

however, the cost of potential collisions is offset by the fact that choosing a bin this way is significantly faster.

in the past, ruby has used prime numbers to determine the size of the bins array. this approach gave some additional assurance that a hashing algorithm which didn't return evenly distributed hashes would not cause a single bin to become unbalanced in size.

the bin size was used to mod the computed hash, and because the bin size was prime, it was unlikely to share common factors with the computed hashes, which decreased the risk of distinct hashes landing in the same bin.

since version 2.2.0 ruby has used bin array sizes that correspond to powers of two (16, 32, 64, 128, etc.). when we know the bin count is a power of two we can use the lower bits of the hash to calculate a bin index, so we find out where to store our entry reference much more quickly.
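as a quick illustration in plain ruby (this is not the c internals, just the arithmetic), reducing a hash modulo a power-of-two bin count is equivalent to masking off its lower bits, which avoids an expensive division entirely:

hash = "some_key".hash
bins = 16                          # a power of two

hash % bins == (hash & (bins - 1))
# => true -- the bin index is just the lowest 4 bits of the hash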

what's wrong with prime modulo mapping?

dividing big numbers by primes is slow. dividing a 64-bit number (a hash) by a prime can take more than 100 cpu cycles for each iteration, which is even slower than accessing main memory.

even though the new approach may produce more hash collisions, it will ultimately improve performance, because collisions will probe the available bins linearly.

linear probing

the open addressing strategy in ruby 2.4 uses a "full cycle linear congruential generator".

this is just a function that generates pseudorandom numbers based on a seed, much like ruby's random#rand method.

given the same seed the random#rand method will generate the same sequence of numbers, even if we create a new instance:

irb> r = Random.new(7)
# => #<Random:0x007fee63030d50>
irb> r.rand(1..100)
# => 48
irb> r.rand(1..100)
# => 69
irb> r.rand(1..100)
# => 26
irb> r = Random.new(7)
# => #<Random:0x007fee630ca928>
irb> r.rand(1..100)
# => 48
irb> r.rand(1..100)
# => 69
irb> r.rand(1..100)
# => 26

# note that these values will be distinct for separate ruby processes.
# if you run this same code on your machine you can expect to get different numbers.

similarly a linear congruential generator will generate the same numbers in sequence if we give it the same starting values.

linear congruential generator (lcg)

this is the algorithm for a linear congruential generator:

x_(n+1) = (a * x_n + c) % m

for carefully chosen values of a, c, m and initial seed x_0 the values of the sequence x will be pseudorandom.

here are the rules for choosing these values:

  • m must be greater than 0 (m > 0)
  • a must be greater than 0 and less than m (0 < a < m)
  • c must be greater than or equal to 0 and less than m (0 <= c < m)
  • x_0 must be greater than or equal to 0 and less than m (0 <= x_0 < m)

implemented in ruby the lcg algorithm looks like this:

irb> a, x_n, c, m = [5, 7, 3, 16]
# => [5, 7, 3, 16]
irb> x_n = (a * x_n + c) % m
# => 6
irb> x_n = (a * x_n + c) % m
# => 1
irb> x_n = (a * x_n + c) % m
# => 8

for the values chosen above that sequence will always return 6, 1 and 8, in that order. because i've chosen the initial values with some additional constraints, the sequence will also choose every available number before it comes back around to 6.

an lcg that returns each number before returning any number twice is known as a "full cycle" lcg.

full cycle linear congruential generator

for a given seed we describe an lcg as full cycle when it will traverse every available state before returning to the seed state.

so if we have an lcg that is capable of generating 16 pseudorandom numbers, it's a full cycle lcg if it will generate a sequence including each of those numbers before duplicating any of them.

irb> (1..16).map { x_n = (a * x_n + c) % m }.sort
# => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

these are the additional rules we must use when choosing our starting values to make an lcg full cycle:

  • c can't be 0 (c != 0)
  • m and c are relatively prime (the only positive integer that divides both of them is 1)
  • (a - 1) is divisible by all prime factors of m
  • (a - 1) is divisible by 4 if m is divisible by 4

the first requirement makes our lcg into a "mixed congruential generator". any lcg with a non-zero value for c is described as a mixed congruential generator, because it mixes multiplication and addition.

if c is 0 we call the generator a "multiplicative" congruential generator (mcg), because it only uses multiplication. an mcg is also known as a lehmer random number generator (lrng).

the last 3 requirements in the list up above make a mixed congruential generator into a full cycle lcg. those 3 rules by themselves are called the hull-dobell theorem.

hull-dobell theorem

the hull-dobell theorem describes a mixed congruential generator with a full period (one that generates all values before repeating).
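as a sanity check, here is a tiny brute force test (illustrative only, not part of ruby) that confirms whether a given a, c and m produce a full period:

# does x_(n+1) = (a * x_n + c) % m visit all m values before repeating?
def full_cycle?(a, c, m, seed = 0)
  x = seed
  values = m.times.map { x = (a * x + c) % m }
  values.uniq.size == m
end

full_cycle?(5, 3, 16) # => true  (gcd(3, 16) = 1, and a - 1 = 4 is divisible by 2 and by 4)
full_cycle?(4, 3, 16) # => false (a - 1 = 3 is not divisible by m's prime factor 2)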

in ruby 2.4 vladimir has implemented an lcg that satisfies the hull-dobell theorem, so ruby will traverse the entire collection of bins without duplication.

remember that the new hash table implementation uses the lower bits of a key's hash to find a bin for our key-index reference, a reference that maps the entry's key to its index in the entries table.

if the first attempt to find a bin for a key results in a hash collision, future attempts will use a different means of calculating the hash.

the unused bits from the original hash are used with the collision bin index to generate a new secondary hash, which is then used to find the next bin.

when the first attempt results in a collision the bin searching function becomes a full cycle lcg, guaranteeing that we will eventually find a home for our reference in the bins array.
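here is a toy version of that search in ruby. the constants below are chosen only to satisfy the hull-dobell theorem and are not the ones ruby uses, and the real implementation also mixes the unused bits of the original hash into the sequence:

BIN_COUNT = 16 # must be a power of two

def probe_sequence(hash)
  index = hash & (BIN_COUNT - 1)   # first candidate bin: the low bits of the hash
  sequence = []

  BIN_COUNT.times do
    sequence << index
    # a = 5, c = 1 satisfy the hull-dobell theorem for m = 16,
    # so this lcg visits every bin exactly once before repeating
    index = (5 * index + 1) % BIN_COUNT
  end

  sequence
end

probe_sequence("some_key".hash).sort
# => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]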

since this open addressing approach allows us to store the much smaller references to entries in the bins array, rather than the entirety of the entries themselves, we significantly decrease the memory required to store the bins array.

the new smaller bins array then increases our chances of taking advantage of the processor caching levels, by keeping this frequently accessed data structure close together in memory. vladimir improved the data locality of the ruby hash table.

so ruby is faster and vladimir is smart?

yup! we now have significantly faster hashes in ruby thanks to vladimir and a whole host of other ruby contributors. please make a point of thanking the ruby maintainers the next time you see one of them at a conference.

contributing to open source can be a grueling and thankless job. most of the time contributors only hear from users when something is broken, and maintainers can sometimes forget that so many people are appreciating their hard work every day.

want to make a contribution yourself?

the best way to express your gratitude for ruby is to make a contribution.

there are all sorts of ways to get started contributing to ruby. if you're interested in contributing to ruby itself, check out the ruby core community page.

another great way to contribute is by testing preview versions as they’re released, and reporting potential bugs on the ruby issues tracker. watch the recent news page (rss feed) to find out when new preview versions are available.

if you don't have the time to contribute to ruby directly consider making a donation to ruby development:

is that everything new in ruby 2.4?

not even close. there are many more interesting updates to be found in the ruby 2.4 changelog.

here are a few of my favorites that i didn't have time to cover:

thank you so much for reading, i hope you have a wonderful holiday.

":heart:" jonan


Information

during the development of the recently released heroku ssl feature, a lot of work was carried out to stabilize the system and improve its speed. in this post, i will explain how we managed to improve the speed of our tls handshakes by 4-5x.

the initial reports of speed issues were sent our way by beta customers who were unhappy about the low level of performance. this was understandable since, after all, we were not greenfielding a solution for which nothing existed, but actively trying to provide an alternative to the ssl endpoint add-on, which is provided by a dedicated team working on elastic load balancers at aws. at the same time, another worry was figuring out how many more routing instances we would need to absorb the cpu load of a major part of our traffic no longer being simple http, but http + tls.

detecting the problem

the simplest way to get a handle on both of these questions is benchmarking. so we set up some simple benchmarks that used tls with no session resumption or caching, and http requests that were extremely small with no follow-up requests coming over the connection. the objective was to specifically exercise handshakes:

  • tls handshakes are more costly than just encrypting data over a well-established connection.
  • http keep-alive requests reuse the connection and the first one is technically more expensive (since it incurs the cost of the handshake), so we disabled them to always have the most expensive thing happening.
  • small amounts of data to encrypt mean that the weight of the handshake dominates the overall amount of time taken for each query. we wanted more handshakes and less of everything else.

under such benchmarks, we found that our nodes became slow and unresponsive at a depressingly rapid rate, almost 7-8 times earlier than with the same test over http. this is a far cry from reported overheads of 1%-2% in places like google, although we purposefully went with a very pessimistic pattern. a more realistic pattern would likely have had lower overhead.

this could easily have been a cause for panic. we had used the erlang/otp ssl libraries to serve our traffic. while there's some added safety in having a major part of your stack not written in c and not dependent on openssl, which recently has experienced several notable vulnerabilities, we did run the risk of much worse performance. to be clear, the erlang ssl library does use openssl bindings for all of the cryptographic functionality but uses erlang for protocol-level parts of the implementation, such as running state machines. the library has gone through independent audit and testing.

we have a team of people who know erlang pretty well and are able to profile and debug it, so we decided to see if we could resolve the performance issue by just tweaking standard configurations. during initial benchmarking, we found that bandwidth and packet counts were very low and memory usage was as expected. cpu usage was fairly low (~30%) but did tend to jump around quite a bit.

for us, this pointed to a standard kind of bottleneck issue where you see flapping as some work gets done in a batch, then the process waits, then does more work. however, from our internal tools, we could see very little that was not just ssl doing work.

eventually, we used perf top to look at things, which is far less invasive than most tooling out there (if you're interested, you should take a look at julian squire's talk at the erlang user conference 2016 on this specific topic).

the thing that immediately jumped out at us was that a bunch of functions were taking far more time than we'd expect. specifically, the following results would show up:

perf top output for the node

the copy_shallow and do_minor functions are related to garbage collection operations within the erlang vm. they can be high in regular circumstances, but here they were much, much higher than expected. in fact, gc was taking more time than actual crypto work! the other thing that took more time than crypto was the db_next_hash function, which was a bit funny.

we looked some more, and as more samples came out, the pattern emerged:

perf top output for the node

cpu time would flap a lot between a lot of garbage collection operations and a lot of db_*_hash operations, whereas given the benchmark, we would have expected libcrypto.so to do the most work.

the db_*_hash operations are specifically related to something called ets (erlang term storage) tables. ets tables are an efficient in-memory database included with the erlang virtual machine. they sit in a part of the virtual machine where destructive updates are allowed and where garbage collection dares not approach. they're generally fast, and a pretty easy way for erlang programmers to optimize some of their code when parts of it get too slow.

in this case, though, they appeared to be our problem. specifically, the next and get operations are expected to be cheap, but select tends to be very expensive and a sign of full-table scans.

by logging onto the node during the benchmark, we could make use of erlang's tracing facilities. the built-in tracing functions basically let an operator look at all messages, function calls and function returns, garbage collections, scheduling activities, data transiting in and out of ports, and so on, at the language level. this tracing is higher level than tracing provided by tools such as strace or dtrace.

we simply ran calls to recon_trace:calls({ets, select_delete, '_'}, 100) and saw that all of the calls came from a single process named ssl_manager.

understanding and fixing the problem

the ssl manager in the erlang library has been a potential bottleneck for a long period of time. prior to this run, we had already disabled all kinds of cache implementations that turned out to be slower for our use cases. we had also identified this lone, central process as a point of contention by design -- we have a few of them and tend to know about them as specific load scenarios exercise them more than others.

the tracing above had also shown that the trace calls were made from a module called ssl_pkix_db, which is in turn called by ssl_manager.

these modules were used by the ssl implementation as a cache for intermediary certificates for each connection. the initial intent was to cache certificates read from disk by erlang, say from /etc/ssl/.

without a cache, this would be costly: the server would be fetching the files from disk hundreds or thousands of times a second, decoding them from pem to der format, and subsequently converting them to their internal erlang format.

the cache was set in place such that as soon as one connection requested a given file, it would get decoded once, and then cached in memory through ets tables with a reference count. each new session would increment the count, and each terminated session would decrement the count. when the count reaches 0, the cache is dropped. additional table scans would take place to provide upper time limits for caches. the cache was then used for other operations, such as scanning through ca certs during revocation checks.

the funny bit there is that the ssl library supports an entire other format to handle certificates: have the user decode the pem data to the der format themselves and then submit the der data directly in memory through the connection configuration. for dynamic routers such as heroku's, that's the method we've taken to avoid storing multiple certificates on disk unencrypted. we store them ourselves, using appropriate cryptographic means, and then decode them only once they are requested through sni, at which point they are passed to the ssl library.

for this use case, the ssl library has certificates passed straight in memory and does not require file access. still, the same caching code path was used. the certificates are cached but not reference-counted and also not shared across connections either. for heavy usage of der-encoded certificates, the pem cache, therefore, becomes a central bottleneck for the entire server, forcing the decoding of every certificate individually through that single critical process.

patching results

we decided to write a very simple patch whose purpose is just to bypass the cache wholesale, nothing more. once the patch was written, we needed to test it. we ran it through our benchmark sequence and the capacity of each instance instantly tripled, with response times now much lower than before.

it soon came to the step of sending our first canary nodes into production to see how they'd react with real-world data rather than benchmark cases. it didn't take too long to see the improvements:

deploy results for a canary node

the latencies on the node for all ssl/tls queries over sni had instantly dropped to a fourth or a fifth of their previous values -- from 100-120 milliseconds roundtrip down to a range of 20-25 milliseconds for the particular subset of apps we were looking at. that was a lot of overhead.

in fact, the canary node ran for a while and beat every other node we had on the platform:

deploy results for a canary node vs. all the others

it seemed a bit surprising that so much time was spent only through the bottleneck, especially since our benchmark case was more pessimistic than our normal production load. then we looked at garbage collection metrics:

erlang gc diagram

the big dip in the middle is the node restarting to run the new software. what's interesting is that as soon as the new version was used, the garbage collection count went from about 3 million gcs per minute down to roughly 2.5 million gcs per minute. the reclaimed words are impacted more significantly: from around 7 billion words reclaimed down to 4 billion words reclaimed per minute.

since erlang garbage collections are per-process, non-blocking, and generational, it is expected to see a lot of small garbage collections and a few larger ones. the overall count includes both values. each garbage collection tends to take a very short time.

what we know for a fact, though, is that we had this one bottleneck of a process and that a lot of time was saved. the supposition is that because a lot of requests (in fact, all of the sni termination requests) had to touch this one erlang process, it would have time to accumulate significant amounts of garbage for short periods of time. this garbage collection impacted the latency of this process and the scheduler it was on, but not the others; processes on other cores could keep running fine.

this yielded a pattern where a lot of data coming from the other 15 cores could pile up into the message queue of the ssl manager, up to the point that memory pressure forced it to gc more and more often. at regular intervals, a more major gc would take place, and stall an even larger number of requests.

by removing the one small bottleneck, the level of garbage generated was more equally shared by all callers and also tended to be much more short-lived. in fact, if the caller was a short-lived process, its memory would be reclaimed on termination without any garbage collection taking place.

the results were so good that we rapidly deployed them to the rest of the platform:

overall deploy latency graph

those values represented the median (orange), the 95th percentile (blue), and the 99th percentile (green). you can see the progression of the deploy as the median rapidly shrinks, and the point where the deploy finishes when the 99th percentile responses became faster than our previous medians.

the patch has been written, tested, cleaned up, and sent upstream to the otp team at ericsson (see the pull request). it has recently been shipped along with erlang otp-19.1, and everyone can now use it in the releases starting then.


Information

the heroku connect team ran into problems with existing task scheduling libraries. because of that, we wrote redbeat, a celery beat scheduler that stores scheduled tasks and runtime metadata in redis. we’ve also open sourced it so others can use it. here is the story of why and how we created redbeat.

background

heroku connect makes heavy use of celery to synchronize data between salesforce and heroku postgres. over time, our usage has grown, and we came to rely more and more heavily on the beat scheduler to trigger frequent periodic tasks. for a while, everything was running smoothly, but as we grew, cracks started to appear. beat, the default celery scheduler, began to behave erratically, with intermittent pauses (yellow in the chart below) and occasionally hanging (red in the chart below). hangs would require manual intervention, which led to an increased pager burden.

redbeat-before

out of the box, beat uses a file-based persistent scheduler, which can be problematic in a cloud environment where you can’t guarantee beat will restart with access to the same filesystem. of course, there are ways to solve this, but it requires introducing more moving parts to manage a distributed filesystem. an immediate solution is to use your existing sql database to store the schedule, and django-celery, which we were using, allows you to do this easily.

after digging into the code, we discovered the hangs were due to blocked transactions in the database and the long pauses were caused by periodic saving and reloading of the schedule. we could mitigate this issue by increasing the time between saves, but this also increases the likelihood that we'd lose data. in the end, it was evident that django-celery was a poor fit for this pattern of frequent schedule updates.

we were already using redis as our celery broker, so we decided to investigate moving the schedule into redis as well. there is an existing celerybeatredis package, but it suffers from the same design issues as django-celery, requiring a pause and full reload to pick up changes.

so we decided to create a new package, redbeat, which takes advantage of the inherent strengths of redis. we’ve been running it in production for over a year and have not seen any recurrences of the problems we were having with the django-celery based scheduler.

the redbeat difference

how is redbeat different? the biggest change is that the active schedule is stored in redis rather than within the process space of the celery beat daemon.

no longer does creating or modifying a task require beat to pause and reload; we just update a key in redis and beat will pick up the change on the next tick. a nice side-effect of this is it’s trivial to make updates to the schedule from other languages. as with django-celery, we no longer need to worry about sharing a file across multiple machines to preserve metadata about when tasks were last run. startup and shutdown times improved since we don't suffer from load spikes caused by having to save and reload the entire schedule from the database. rather, we have a steady, predictable load on redis.

finally, we added a simple lock that prevents multiple beat daemons from running concurrently. this can sometimes be a problem for heroku customers when they scale up from a single worker or during development.

after converting to redbeat, we’ve had no scheduler related incidents.

redbeat-after

needless to say, so far we’ve been happy with redbeat and hope others will find it useful too.

why not take it for a spin and let us know what you think?


Information

at emberconf terence lee and i had a chance to sit down with tom dale and chat about the history of ember.js and where it’s headed now, including some details on the newly extracted glimmer.js rendering engine. this post details a lot of the history of ember, including some of the motivation that led the framework to what it is today. watch the blog for the second portion of this interview with all of the details on glimmer.js. the next post will also include the full audio of the interview, with many questions we opted to omit from the transcription to save valuable bytes.

jonan: so, we're at emberconf speaking with tom dale, who gave a keynote today with some important announcements. we're going to dig into those in just a minute here, but i’d like you to introduce yourselves please.

tom: sure. hey, i'm tom. i just started working at linkedin as a senior staff software engineer, and i work on a really awesome team that works on ember infrastructure. as you may have seen, linkedin’s website now is one big ember application. so my job is to make the army of engineers at linkedin productive, and make sure that we're able to build a really awesome web software.

terence: i'm terence, i do language stuff and rails on the languages team [at heroku].

jonan: there's a third-party ember buildpack that you worked on, right?

terence: yes. that has no javascript in it.

jonan: no javascript at all? but it ships ember. i shipped my first ember app on it.

tom: that's not true.

terence: it is true.

tom: it's all ruby?

terence: oh, yeah.

tom: awesome. see that's great. you know what, ember is a big tent, as dhh would say. not about ember, he would say that about rails and then i would copy that because that's basically what we do. we just take what dhh says, and we repeat it in the context of javascript, and it sounds very thought leadery.

jonan: would you describe ember as omakase?

tom: i would describe it as being bespoke, artisanal, shade-grown omakase.

jonan: that's even better. so on the subject of ember: it's been around for a while now. how old is ember? five years plus?

tom: it depends on what date you want to use. so if you're talking about ember 1.0, i think it's been about five years.

terence: do you include sproutcore in that?

tom: i mean i think we should. there is no ember without sproutcore, and to me sproutcore was one of the first libraries or frameworks to adopt this idea of client-side architecture. so one thing that we talked about in the keynote yesterday was just how much the web has changed in five years, right? so five years ago, ie was the dominant browser but actually, sproutcore had it way worse. and we're talking about ie6 and ie7 and talking about ambitious things, what we do on the web.

jonan: and you did it in an era where browsers were not even close to where they are today.

tom: not even close, not even close.

jonan: that's interesting. so then, from sproutcore, ember comes out five years ago and we're off to the races. a lot changed in that first year, you went 1.0 and you’ve said that there were a lot of things that went wrong along the way. in your talk, you had a slide where you mentioned a few of those things. from the 10,000-foot view, what kind of lessons did you learn in those first 5 years?

tom: javascript apps felt broken and people didn’t know why but people always said, "javascript apps feel broken, you know, for whatever reason, please don’t use them" right? and people wanted to shame you for using javascript. the reason for that, i think, is urls. urls are kind of the linchpin that holds the web together. and so much of the value of the web over native is these urls, and javascript apps just ignored them. sproutcore ignored them, and almost every javascript framework did. so, what ember had to do was figure out how to build javascript apps that don’t feel broken on the web. that’s where all this work with the router started.

nowadays, routers are taken for granted. every framework, every library has a router that you can drop into it. but there was actually some novel computer science work that went into it, in how we tie the architecture of the app to the url. that took a long time and it was a very organic process. i don’t think we fully understood the magnitude of the research project that was going on. there are a lot of examples of that where we tackled problems for the first time, so of course, there's gonna be kind of an organic exploration of that space.

another example of this is that when we adopted the six-week release cycle, this train model with canary, beta, and release channels, the only other people doing it were chrome and, i think, firefox. and when we adopted it, it paid dividends right away, and i'm so happy that we adopted it. one constraint that we have that chrome and firefox don’t have as much is that for us, we're always shipping the framework over the wire every time a user visits your webpage, right?

jonan: right.

tom: so it's very easy to have feature flags and to keep all the apis around when you're distributing a binary. it's much harder when every time you do that, your file size goes up, and up, and up. and so what we've had to figure out is okay, "well, we really liked this release train model. people really like the fact that it's backwards compatible. people really don’t like ecosystem breaking changes like python 2 to python 3 or angular 1 to angular 2. that doesn’t work so what do we do?"

you know, you feel kind of stuck. so we've had to figure out a lot of things. like one thing that we've been working on is something called project svelte, which is the ability to say, "you can opt out of deprecated features and we will strip those from the build".

jonan: but that's the only way that you can really move forward there. i mean if you've got to make this smaller, you can't just deprecate things arbitrarily.

tom: right.

jonan: you can't make those decisions for your user. your file size is ever growing, which when you're shipping over the wire, is not a great thing.

this has already, historically, been an issue for ember, the size of the framework.

so what you are providing people now is a way to opt out of some those deprecated features. so say that, "all right, i've stopped using this api in my codebase, we can strip this out."

that's known as project svelte?

tom: yeah, that's project svelte. it's really important to remember that when ember started, there were no package managers. npm wasn’t 1.0 or just hit 1.0, and was not at all designed for frontend packages. it didn’t do any kind of deduplication and distributing.

this is back in the day when the way that you got a library was you googled for the website, you found it, they gave you a script tag to just drop in. i'm sure you all agree that's a horrible way to do dependency management.

so we felt compelled to say, "well, if we wanna make something… if we want people to actually use something, we have to bake it in." because when you're gathering all your dependencies by hand, you're only gonna have, you know, four or five of them. you're not gonna go get a million dependencies. of course, that has changed dramatically and we have new technology like yarn, which is more like a cargo/bundler style of dependency resolution for javascript.

what we found has not worked is trying to do big-design, upfront projects, because anything that we land in ember gets that guarantee of stability and compatibility.

people feel a very strong sense of responsibility, that if we land this feature, this has to be something that we are ready to support for the foreseeable future, and that just takes longer. it's the same reason standards bodies move relatively slowly.

jonan: right. now, this is something you brought up in your keynote. rather than architecting or spending a huge amount of time and investment upfront architecting your system, you want to get it out in front of the customers as early as possible. but that conflicts with the idea that you're trying to present stable products, things that won't change, right?

terence: stability without stagnation is the tagline right?

tom: right. so that's the message but then we also know that you can't do a big design upfront, and you're not gonna get it perfect the first time. you ship an mvp and iterate.

so how do you balance this tension? if you look at the projects we've embarked on in the last couple of years, there have been some projects that were more big design upfront. and those have largely stagnated and failed because of the fact that we just couldn’t get consensus on them.

then you have some other projects like ember engines and fastboot. what we actually did was look at how web standards bodies work -- tc39, w3c, whatwg.

there's something called the "extensible web manifesto," which you may have seen, that says "hey, standard bodies, open source libraries are able to iterate a lot faster than you are. so instead of focusing on building these big, beautiful, really-easy-to-use declarative apis, give us the smallest primitive needed to experiment on top of that."

that’s something that we really take to heart in ember 2. if you think of ember as being this small stable core, what we can do is expose just the smallest primitive that you need, and then we can let the experimentation happen in the community.

so with fastboot, for example, fastboot is this entire suite of tools for deploying server-side rendered, client-side apps. you can easily push it to heroku and, boom, it starts running, but that doesn’t need to live in ember. we can do all the http stuff, all of the concurrency stuff. all of that can live outside of ember, all ember needs to say is, "give me a url and i will give you the html for that back."

so that's what we did. there's this api called visit, the ‘visit’ method. you call it, you give the url, you get html back, and it's so simple and you can easily have discussion about it.

you can understand how it's gonna operate and that's the thing that we landed. then that's given us a year to experiment in fastboot and make a lot of really important changes.

jonan: you were able to hide the complexity away behind this simple api.

tom: right.

jonan: so some of the things that more recently you mentioned in your keynote as not having gone well, were ember pods, for example, and now we have module unification. so if i understand correctly, ember pods was a way to keep all of your component files related to a single component in one location?

tom: right. the rails style where you have one directory that's all controllers and one directory that's all views or templates, which is how ember started. it's still the standard way, the default way you get when you create a new ember app.

people found it more productive to say, "i'm gonna have a feature directory", where you have your component and that component might have style. it might have javascript, it might have templates. i think it's just easier for people to reason about those when they're all grouped together, instead of bouncing around.

jonan: i love this idea. when i first came into rails, i distinctly remember going from file to file and thinking, "where even is this thing. how do i find this?"

so you had said that ember pods, maybe, didn’t seem to take off? it wasn't a very popular solution to that problem, and now we have module unifications. how is that different?

tom: i actually think that pods was popular, it actually was very popular. so, there's something dhh says: "beginners and pro users should climb the mountain together."

i think it's a bad sign, in your framework, if there's the documented happy path that beginners use, and then at some point, they fall off the cliff and see "oh, actually there's this pro api. it's a little bit harder to use but now that you're in the club, now you get to use it". i think that leads to very bad experiences for both. you kind of wanna have both sets of people going up the same route.

so pods is almost this secret opt-in handshake. and it was just one of those things where it started off as an experiment but then slowly became adopted to the point where, i think, we didn’t move fast enough.

jonan: i see.

tom: we didn’t move fast enough and now, there's almost this bifurcation between apps that are not using pods and apps that are using pods.

so with module unification what we did is we sat down and we said "ok, pods was a really nice improvement but it didn’t have a ton of design applied to it. it was the kind of thing that evolved organically. so let's just sit down and try to design something."

for us, it was really important with module unification to say, "not only does it need to be good but we need to have a way of being able to automatically migrate 99% of the ember apps today. we should have a command that will just migrate them to the new file system."

so one thing that's really neat is that you can just have a component where all you have to do is drag it into another component's directory and now it's scoped. it's almost like a lexical scope in a programming language. we're using the file system to scope which components know about each other.

jonan: so, forgive my simplification here but i'm not great at ember. if i have a login component and it's a box to just log in, and inside of it i wanted to have a google auth button and a twitter auth button, each of those could be independent components.

maybe i wanna reuse it somehow. i would drag them into my login directory and that makes them scoped, so we can't use them somewhere else.

tom: right. that ends up being pretty nice because often, what you'll do is you'll create a new component, give it a really nice and appropriate semantic name and, oops, it turns out your coworker used that for another page, a year ago. now, you can't use it, because it’s completely different.

jonan: so i've got my ember app and i've been using pods all this time, and now, we have module unification and there's a new way to do this. i can just move over to module unification right?

tom: yes.

jonan: we run this script that you've written and it would migrate me over?

tom: yeah. so we have a migrator and because there's so many ember apps using the classic system, so many ember apps using the pod system, it can handle both.

terence: could module unification have happened without ember pods happening first?

tom: it's hard to say. i think it's something that people really wanted, and i think it's fantastic. this is something we touched on the keynote; one thing that we've always said about ember, and i think this is true about rails also, is that there's always a period of experimentation when something new comes along. you really want that experimentation to happen in the community. then eventually, it seems like one idea has won out in a lot of ways. the things that we learned about with pods fed directly into module unification design.

jonan: so maybe, we could chat a little bit about deprecating controllers in ember?

tom: sure, yeah.

jonan: you announced that you were going to deprecate all of the top-level controllers by 2.0, and then pushed 2.1 and 2.2. that's still the plan to deprecate the controllers someday?

tom: i think what we are always dedicated to is trying to slim down the programming model and always reevaluate what is the experience like for new people. i don’t want to say that we're going to deprecate controllers because that sounds like a very scary thing, right? there's a lot of people with a lot of controllers in their apps. but i do think what we will want to do is take a look at the ember programming model from the perspective of a new user. and say, "well, it seems like people already learned about components. and it seems like there's probably some overlap between what a controller does and what a component does."

so maybe there's some way we can unify these concepts so people don’t have to learn about this controller thing with its own set of personality quirks.

jonan: is this where routable components fit into the idea then?

tom: so that's the idea of routable components and i think i don’t have a concrete plan for exactly how this is going to work. i think a lot of ways, the work that we want to do on that was blocked by the glimmer component api.

i think what we'd like to do is add whatever low-level hooks in ember are needed so that we can maybe do some experimentation around things like routable components outside. let people get a feel for it and then once we have a design that we're really happy with, then we can land it back in mainland ember.

that’s the end of our discussion on the history and direction of the ember project. stay tuned for part two and learn more about the glimmer.js project.


Information

at emberconf terence lee and i had a chance to sit down with tom dale and chat about the history of ember.js and where it’s headed now, including some details on the newly extracted glimmer.js rendering engine. this post details a lot of the history of ember, including some of the motivation that led the framework to what it is today. watch the blog for the second portion of this interview with all of the details on glimmer.js. the next post will also include the full audio of the interview, with many questions we opted to omit from the transcription to save valuable bytes.

jonan: so, we're at emberconf speaking with tom dale, who gave a keynote today with some important announcements. we're going to dig into those in just a minute here, but i’d like you to introduce yourselves please.

tom: sure. hey, i'm tom. i just started working at linkedin as a senior staff software engineer, and i work on a really awesome team that works on ember infrastructure. as you may have seen, linkedin's website is now one big ember application. so my job is to make the army of engineers at linkedin productive, and make sure that we're able to build really awesome web software.

terence: i'm terence, i do language stuff and rails on the languages team [at heroku].

jonan: there's a third-party ember buildpack that you worked on, right?

terence: yes. that has no javascript in it.

jonan: no javascript at all? but it ships ember. i shipped my first ember app on it.

tom: that's not true.

terence: it is true.

tom: it's all ruby?

terence: oh, yeah.

tom: awesome. see, that's great. you know what, ember is a big tent, as dhh would say. not about ember -- he would say that about rails, and then i would copy it, because that's basically what we do. we just take what dhh says, repeat it in the context of javascript, and it sounds very thought-leadery.

jonan: would you describe ember as omakase?

tom: i would describe it as being bespoke, artisanal, shade-grown omakase.

jonan: that's even better. so, on the subject of ember: it's been around for a while now. how old is ember? five years plus?

tom: it depends on what date you want to use. so if you're talking about ember 1.0, i think it's been about five years.

terence: do you include sproutcore in that?

tom: i mean, i think we should. there is no ember without sproutcore, and to me sproutcore was one of the first libraries or frameworks to adopt this idea of client-side architecture. so one thing that we talked about in the keynote yesterday was just how much the web has changed in five years, right? five years ago, ie was the dominant browser, but sproutcore actually had it way worse -- we're talking about ie6 and ie7, while still talking about doing ambitious things on the web.

jonan: and you did it in an era where browsers were not even close to where they are today.

tom: not even close, not even close.

jonan: that's interesting. so then, from sproutcore, ember comes out five years ago and we're off to the races. a lot changed in that first year, you went 1.0 and you’ve said that there were a lot of things that went wrong along the way. in your talk, you had a slide where you mentioned a few of those things. from the 10,000-foot view, what kind of lessons did you learn in those first 5 years?

tom: javascript apps felt broken and people didn’t know why but people always said, "javascript apps feel broken, you know, for whatever reason, please don’t use them" right? and people wanted to shame you for using javascript. the reason for that, i think, is urls. urls are kind of the linchpin that holds the web together. and so much of the value of the web over native is these urls, and javascript apps just ignored them. sproutcore ignored them, and almost every javascript framework did. so, what ember had to do was figure out how to build javascript apps that don’t feel broken on the web. that’s where all this work with the router started.

nowadays, routers are taken for granted. every framework, every library has a router that you can drop into it. but there was actually some novel computer science work that went into it, in how we tie the architecture of the app to the url. that took a long time and it was a very organic process. i don’t think we fully understood the magnitude of the research project that was going on. there are a lot of examples of that where we tackled problems for the first time, so of course, there's gonna be kind of an organic exploration of that space.

another example of this is that when we adopted the six-week release cycle, this train model with canary, beta, and release channels, the only other people doing it were chrome and, i think, firefox. and when we adopted it, it paid dividends right away, and i'm so happy that we adopted it. one constraint that we have that chrome and firefox don't have as much is that for us, we're always shipping the framework over the wire every time a user visits your webpage, right?

jonan: right.

tom: so it's very easy to have feature flags and to keep all the apis around when you're distributing a binary. it's much harder when every time you do that, your file size goes up, and up, and up. and so what we've had to figure out is okay, "well, we really liked this release train model. people really like the fact that it's backwards compatible. people really don’t like ecosystem breaking changes like python 2 to python 3 or angular 1 to angular 2. that doesn’t work so what do we do?"

you know, you feel kind of stuck. so we've had to figure out a lot of things. like one thing that we've been working on is something called project svelte, which is the ability to say, "you can opt out of deprecated features and we will strip those from the build".

jonan: but that's the only way that you can really move forward there. i mean if you've got to make this smaller, you can't just deprecate things arbitrarily.

tom: right.

jonan: you can't make those decisions for your user. your file size is ever growing, which when you're shipping over the wire, is not a great thing.

this has already, historically, been an issue for ember, the size of the framework.

so what you are providing people now is a way to opt out of some of those deprecated features. so you can say, "all right, i've stopped using this api in my codebase, we can strip this out."

that's known as project svelte?

tom: yeah, that's project svelte. it's really important to remember that when ember started, there were no package managers. npm either wasn't at 1.0 or had just hit 1.0, and it was not at all designed for frontend packages. it didn't do any kind of deduplication or distribution for the frontend.

this is back in the day when the way that you got a library was you googled for the website, you found it, they gave you a script tag to just drop in. i'm sure you all agree that's a horrible way to do dependency management.

so we felt compelled to say, "well, if we wanna make something… if we want people to actually use something, we have to bake it in." because when you're gathering all your dependencies by hand, you're only gonna have, you know, four or five of them. you're not gonna go get a million dependencies. of course, that has changed dramatically and we have new technology like yarn, which is more like a cargo/bundler style of dependency resolution for javascript.

what we found has not worked is trying to do big-design-upfront projects, because anything that we land in ember gets that guarantee of stability and compatibility.

people feel a very strong sense of responsibility, that if we land this feature, this has to be something that we are ready to support for the foreseeable future, and that just takes longer. it's the same reason standards bodies move relatively slowly.

jonan: right. now, this is something you brought up in your keynote. rather than spending a huge amount of time and investment upfront architecting your system, you want to get it out in front of the customers as early as possible. but that conflicts with the idea that you're trying to present stable products, things that won't change, right?

terence: stability without stagnation is the tagline right?

tom: right. so that's the message but then we also know that you can't do a big design upfront, and you're not gonna get it perfect the first time. you ship an mvp and iterate.

so how do you balance this tension? if you look at the projects we've embarked on in the last couple of years, there have been some projects that were more big design upfront. and those have largely stagnated and failed because of the fact that we just couldn’t get consensus on them.

then you have some other projects like ember engines and fastboot. what we actually did was look at how web standards bodies work -- tc39, w3c, whatwg.

there's something called the "extensible web manifesto," which you may have seen, that says "hey, standards bodies, open source libraries are able to iterate a lot faster than you are. so instead of focusing on building these big, beautiful, really-easy-to-use declarative apis, give us the smallest primitives needed, and let us experiment on top of those."

that’s something that we really take to heart in ember 2. if you think of ember as being this small stable core, what we can do is expose just the smallest primitive that you need, and then we can let the experimentation happen in the community.

so with fastboot, for example, fastboot is this entire suite of tools for deploying server-side rendered, client-side apps. you can easily push it to heroku and, boom, it starts running, but that doesn’t need to live in ember. we can do all the http stuff, all of the concurrency stuff. all of that can live outside of ember, all ember needs to say is, "give me a url and i will give you the html for that back."

so that's what we did. there's this api called visit, the 'visit' method. you call it, you give it the url, you get html back, and it's so simple that you can easily have a discussion about it.

you can understand how it's gonna operate and that's the thing that we landed. then that's given us a year to experiment in fastboot and make a lot of really important changes.
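to make that shape concrete, here is a minimal sketch of the division of labor tom describes: the http server, error handling, and port binding live outside of ember, and ember only has to answer "give me a url, get html back." the visit function below is a stand-in for that primitive, not ember's or fastboot's actual implementation.

import { createServer } from 'http';

// stand-in for the low-level primitive described above: take a url and
// resolve to the rendered html for that route. in a real app this is where
// ember (via fastboot) would do the server-side render.
async function visit(url: string): Promise<string> {
  return `<html><body>rendered ${url}</body></html>`;
}

// everything below -- http handling, error handling, port binding -- is the
// part that can live entirely outside of ember.
createServer((req, res) => {
  visit(req.url || '/')
    .then((html) => {
      res.writeHead(200, { 'content-type': 'text/html' });
      res.end(html);
    })
    .catch(() => {
      res.writeHead(500);
      res.end('server-side render failed');
    });
}).listen(Number(process.env.PORT) || 3000);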

jonan: you were able to hide the complexity away behind this simple api.

tom: right.

jonan: so some of the things that more recently you mentioned in your keynote as not having gone well, were ember pods, for example, and now we have module unification. so if i understand correctly, ember pods was a way to keep all of your component files related to a single component in one location?

tom: right. the rails style where you have one directory that's all controllers and one directory that's all views or templates, which is how ember started. it's still the standard way, the default way you get when you create a new ember app.

people found it more productive to say, "i'm gonna have a feature directory", where you have your component and that component might have style. it might have javascript, it might have templates. i think it's just easier for people to reason about those when they're all grouped together, instead of bouncing around.

jonan: i love this idea. when i first came into rails, i distinctly remember going from file to file and thinking, "where even is this thing? how do i find it?"

so you had said that ember pods, maybe, didn't seem to take off? it wasn't a very popular solution to that problem, and now we have module unification. how is that different?

tom: i actually think that pods was popular -- it was very popular. so, there's something dhh says: "beginners and pro users should climb the mountain together."

i think it's a bad sign, in your framework, if there's the documented happy path that beginners use, and then at some point, they fall off the cliff and see "oh, actually there's this pro api. it's a little bit harder to use but now that you're in the club, now you get to use it". i think that leads to very bad experiences for both. you kind of wanna have both sets of people going up the same route.

so pods is almost this secret opt-in handshake. and it was just one of those things where it started off as an experiment but then slowly became adopted to the point where, i think, we didn’t move fast enough.

jonan: i see.

tom: we didn’t move fast enough and now, there's almost this bifurcation between apps that are not using pods and apps that are using pods.

so with module unification what we did is we sat down and we said "ok, pods was a really nice improvement but it didn’t have a ton of design applied to it. it was the kind of thing that evolved organically. so let's just sit down and try to design something."

for us, it was really important with module unification to say, "not only does it need to be good but we need to have a way of being able to automatically migrate 99% of the ember apps today. we should have a command that will just migrate them to the new file system."

so one thing that's really neat is that you can just have a component where all you have to do is drag it into another component's directory and now it's scoped. it's almost like a lexical scope in a programming language. we're using the file system to scope which components know about each other.

jonan: so, forgive my simplification here but i'm not great at ember. if i have a login component and it's a box to just log in, and inside of it i wanted to have a google auth button and a twitter auth button, each of those could be independent components.

maybe i wanna reuse it somehow. i would drag them into my login directory and that makes them scoped, so we can't use them somewhere else.

tom: right. that ends up being pretty nice because often, what you'll do is you'll create a new component and give it a really nice and appropriate semantic name and, oops, it turns out your coworker used that name for another page a year ago. now you can't use it, because their component does something completely different.
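to make jonan's example concrete, a module unification layout might look roughly like this (an illustrative sketch, not the exact blueprint from the rfc); the auth buttons nested under login are only resolvable from inside login:

src/
  ui/
    components/
      login/
        component.js
        template.hbs
        google-auth-button/    <- scoped: only visible to login
          component.js
          template.hbs
        twitter-auth-button/   <- scoped: only visible to login
          component.js
          template.hbs
      site-header/             <- a top-level component, visible app-wide
        component.js
        template.hbs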

jonan: so i've got my ember app and i've been using pods all this time, and now, we have module unification and there's a new way to do this. i can just move over to module unification right?

tom: yes.

jonan: we run this script that you've written and it would migrate me over?

tom: yeah. so we have a migrator, and because there are so many ember apps using the classic system and so many using the pod system, it can handle both.

terence: could module unification have happened without ember pods happening first?

tom: it's hard to say. i think it's something that people really wanted, and i think it's fantastic. this is something we touched on in the keynote; one thing that we've always said about ember, and i think this is true about rails also, is that there's always a period of experimentation when something new comes along. you really want that experimentation to happen in the community, and then eventually one idea wins out. the things that we learned from pods fed directly into the module unification design.

jonan: so maybe, we could chat a little bit about deprecating controllers in ember?

tom: sure, yeah.

jonan: you announced that you were going to deprecate all of the top-level controllers by 2.0, and then that was pushed to 2.1, and then 2.2. is the plan still to deprecate controllers someday?

tom: i think what we are always dedicated to is trying to slim down the programming model and always reevaluate what is the experience like for new people. i don’t want to say that we're going to deprecate controllers because that sounds like a very scary thing, right? there's a lot of people with a lot of controllers in their apps. but i do think what we will want to do is take a look at the ember programming model from the perspective of a new user. and say, "well, it seems like people already learned about components. and it seems like there's probably some overlap between what a controller does and what a component does."

so maybe there's some way we can unify these concepts so people don’t have to learn about this controller thing with its own set of personality quirks.

jonan: is this where routable components fit into the idea then?

tom: so that's the idea of routable components, and i don't have a concrete plan yet for exactly how this is going to work. i think in a lot of ways, the work that we want to do on that was blocked by the glimmer component api.

i think what we'd like to do is add whatever low-level hooks in ember are needed so that we can maybe do some experimentation around things like routable components outside. let people get a feel for it and then once we have a design that we're really happy with, then we can land it back in mainland ember.

that’s the end of our discussion on the history and direction of the ember project. stay tuned for part two and learn more about the glimmer.js project.


Information

this is the second of a two-part transcript from a recent interview with tom dale of ember.js. in part one we discussed the history and direction of the ember.js project. continuing the discussion of the future of ember.js, this post includes the rest of the interview, primarily focused on the glimmer.js project. some of the questions were omitted from these transcriptions for brevity, so we're also releasing the nearly hour-long audio of the entire interview. enjoy!

jonan: let’s talk about glimmer 2. if i understand correctly it's released now and it entirely supplants ember. so how are you planning to gracefully sunset the project?

terence: i think locks (ricardo mendes) talked about how people already have five years of ember experience, they can now move on to this glimmer thing, right?

tom: that's right, yeah. you can put five years of glimmer experience on your resume, on your linkedin profile. you know, something we really wanted to be mindful of is that it's really easy to think that we're giving up on ember, or that we just declared bankruptcy and we're starting over again fresh -- because actually, this is what happens in the javascript community all the time, right? new version, backwards incompatible, we decided that it was just too clunky to ever fix.

terence: right. the angular 1 to angular 2 thing?

tom: something like that, right?

jonan: and in some cases, that's the right choice.

terence: yeah, i think it is.

jonan: in the first version, there were mistakes made, let's move on. sometimes there is no right choice in those circumstances -- maybe it's the only choice that you have.

tom: right. so glimmer is a little bit different. the first thing to understand is that glimmer is an extraction, it's not a brand-new library.

one piece of feedback that we get all the time is people say, "you know, i would, theoretically, be interested in using ember but i don’t need all of that stuff. i don’t need a router, i don’t need a data layer. i just want components. i have a rails app and i just wanna do some kind of interactive widget." people use jquery, they use react, things that are really easy to drop in, super simple to learn.

so, we thought about it and said, "well, you know, we actually have this awesome rendering library in this engine called glimmer. why don’t we make it available to people who don’t wanna buy into ember?"

you know it shouldn’t be an all-or-nothing thing. we should try to think about how we can bring incremental value to people. so that's one. it's not a new project. it's an extraction.

the other thing is that i don’t think about glimmer as being a different project. i think about glimmer as being a way for us to experiment with the future of the component api in ember. so one thing that we're working on right now, and actually there is an rfc written by godfrey chan, is an api that lets people write plug-ins that implement different component apis.

remember, linkedin is an ember app. it’s an ember app that has had a lot of money and a lot of work put into it, and i promise you, we're not just gonna throw that away and rewrite it in glimmer.

so we really need to focus on the experience of taking glimmer components and bringing them into an ember app; that's what we're working on right now. glimmer components, i think of it as the future of the ember component api.

what i would really love is that people can start working on glimmer applications, see that it has this beautiful ui, it's fast, it's slick, all these things. then they realize, "hey, actually, maybe i need ember data, maybe i need a router?" and then what they'll do is just take these glimmer components, drag and drop them into their ember app and, boom, they just work without having to change a line of code.

jonan: ember includes a lot of things. it's prepared to handle problems that you can't foresee yet which is one of the benefits of using a framework. but that means that it's larger than maybe you need in the moment. so i could start with a very small glimmer app, and glimmer itself is small, right?

tom: yeah, it's really small.

jonan: so the advantage right now, though -- since we don't have that component transferability yet, we can't just take an ember component and drop it into glimmer today -- is that it's small. you described it as useful for a mobile application. the example you gave in the keynote was a temperature widget that gave the temperature in various cities.

give us like a real-world use case of glimmer.

tom: sure. i mean i can give you a very real-world use case, which is that before i joined linkedin, i was working at this really awesome little startup in new york called monograph. one of the things monograph was doing was building these e-commerce apps that were designed to integrate into social media apps.

so one thing that's really amazing about the web that you can't do on native, is you can actually run inside of other apps. you can run inside of facebook, you can run inside of twitter, you can run inside of any social media app and that's something that native apps can't do.

what we wanted to do was build this experience that felt very native but it also had to load really, really fast. because you didn’t get to load it until a user tapped the link. so we actually tried to build a few prototypes in ember and they actually worked really great on the iphone, but then we had some investors in australia on android phones, and when they tapped on it, it took like 10 seconds to load. that's just not acceptable when you're trying to target these mobile apps.

i said "we have this great entry engine in ember and it's really small. i wonder if i can hack together something using this?" and the reality was that if we couldn’t, i was gonna have to use something like react or something.

so i told my boss "give me a week. give me a week to see if i can do it in glimmer. i have this crazy idea, let's see if we can do it.", and we actually pulled it off.

we've run a few campaigns now and things have sold out in like an hour. so the model works.

i think if you're building an app that needs a router and uses a data layer, yeah, you should be absolutely using ember. this is definitely a pared down experience, but my hope is that we're gonna figure out ways of taking these big apps and kind of slicing them up in a way that will be good for mobile.

jonan: i just want to make sure i've got the technical details right here. ember used to compile templates to javascript, and now it compiles them to bytecode: a series of opcodes that are interpreted on the client side by a couple of different vms. you have an update vm and you have a render vm. so the first time that you load up a page, you're gonna send over some bytecode, and that's gonna be interpreted by this typescript vm, the render vm, and then the updates will go through the update vm on the next round? okay. and so the actual content of this bytecode, what is that?

tom: it's a json object. the reason for that is, of course, that json is a much smaller subset of the javascript language, so the payloads are more compact and much faster to parse.

modern javascript vms are very fast and very good at doing just-in-time compilation to native code. so if we emit javascript, those will get compiled into these really fast operations.

the problem that we didn't realize at the time was that when your app grows, that is a lot of javascript, and all of that javascript gets parsed eagerly. now you are in a situation where you're spending all this time parsing javascript for parts of the page that never needed to be parsed at all, because they never get rendered.

jonan: so in the olden days, again, i need to simplify this for my own thinking here. i have a page with a home and an about page, right?

tom: mm-hmm.

jonan: and i don’t ever click on the about tab. but that javascript is still there.

tom: it's still loaded.

jonan: and it's still interpreted.

tom: still parsed, right.

jonan: and it's not necessary, right?

tom: right.

jonan: so now, in this new world, the json blob that represents that about page, if the user never clicks on that link, it never actually has to get turned into anything.

tom: right. we keep it in memory, resident in memory as a string, and we just-in-time json parse it. and of course, the json parsing is gonna be faster than the javascript parsing because of the fact that it's so restricted.
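as a rough illustration of that just-in-time parsing idea (a toy sketch, not glimmer's internals -- the wire format shown here is made up), the compiled template for each route can sit in memory as a plain string and only be json-parsed the first time that route actually renders:

// hypothetical wire format: each route's template compiled to a json string.
const templatePayloads = new Map<string, string>([
  ['index', '{"opcodes":[["text","welcome"]]}'],
  ['about', '{"opcodes":[["text","about us"]]}'],
]);

// cache of payloads that have actually been parsed.
const parsedTemplates = new Map<string, unknown>();

function templateFor(route: string): unknown {
  let template = parsedTemplates.get(route);
  if (template === undefined) {
    const payload = templatePayloads.get(route);
    if (payload === undefined) {
      throw new Error(`no template for route: ${route}`);
    }
    // json parsing of a restricted format is cheaper than eagerly parsing the
    // equivalent javascript, and it only runs if the route is ever visited.
    template = JSON.parse(payload);
    parsedTemplates.set(route, template);
  }
  return template;
}

// the 'about' payload is never parsed unless the user actually visits it.
console.log(templateFor('index'));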

jonan: i see. and so then, you can take that json and turn it directly into the page that you need. there's no other step there?

tom: right.

jonan: i see, okay.

tom: so one of the advantages of handlebars -- you know, there's kind of this raging debate about whether you should use templates, like ember and glimmer and vue.js do, or whether you should use jsx, like react and all of the stuff react likes.

one of the really nice things about handlebars is that it's very statically analyzable. we can very easily analyze your template and say, "okay, these are the components you're using. these are the helpers that you're using. this is the structure." that lets us do more work when you deploy, when you build and that means less work on each user's browser.

jonan: right, but then also, as you talked about in your keynote, maybe it comes at the expense of file size in some cases which is another problem that glimmer solves. because what you're sending over the wire is actually much smaller now.

tom: right. so that was the other downside of compiling to javascript, and that's just the representation. i mean, think about javascript as a syntax: you have functions, you have var, you have all of these different things. by creating our own json structure, we can get that size down.

jonan: so, we've been talking about progressive web apps a lot this year, and there are a lot of things that have happened recently that enabled progressive web apps to actually be a thing. we can now say reliably that you can present an offline experience that is close to the real experience with the possible exception of mobile safari. i've heard that that's a popular browser.

tom: it is.

jonan: something like 55% of the u.s. market is using an iphone.

tom: that's right.

jonan: so they don't have service workers -- that's the problem here, right? i wanna just explain real quick. a service worker, for my own thinking, is this thread that i can run in the background, and i can schedule tasks on it, so you don't even have to have my page open, right?

tom: right.

jonan: i can go and refresh the data that i'm caching locally.

tom: the most important thing about the service worker, from my perspective, the thing that it unlocked in terms of taking something that usually only the browser can do, is now giving me, as a javascript programmer, access to intercepting network requests.

not just javascript requests -- literally, i can have a service worker, and if i put an image tag on my page, my service worker is consulted: "hey, we're about to go fetch this image. would you like to give me a version from the cache?"

that is hugely powerful when you're talking about building an offline experience. because now you have programmatic access to the browser cache, to the way it looks at resources. so now, you have this very powerful abstraction for building whatever caching you want offline.

jonan: so whatever possible request could be coming from your application is more or less proxied through this service worker?

tom: exactly. so in addition to intercepting the request, you also have access to the browser cache. so you can put things in, you can take things out. that's what lets you program very specific rules. because you don't always wanna say use from the cache, right? sometimes, there are things that you actually want fresh, like how many items in inventory remain, right? you probably don't want that cached. you probably wanna have the most updated information possible.
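here's a minimal sketch of that kind of rule using the standard service worker fetch and cache apis (the inventory url pattern and the cache name are just illustrative): cacheable requests are answered from the cache when possible, while requests for data that has to stay fresh always go to the network.

/// <reference lib="webworker" />

// cast the worker global so typescript knows this is a service worker scope.
const sw = self as unknown as ServiceWorkerGlobalScope;

sw.addEventListener('fetch', (event) => {
  const request = event.request;

  // hypothetical must-be-fresh endpoint: fall through to the network.
  if (request.url.includes('/api/inventory')) {
    return;
  }

  event.respondWith(
    caches.match(request).then((cached) => {
      if (cached) {
        return cached; // programmatic access to the browser cache
      }
      return fetch(request).then((response) => {
        // put a copy into the cache for next time, then return the original.
        const copy = response.clone();
        caches.open('app-cache-v1').then((cache) => cache.put(request, copy));
        return response;
      });
    })
  );
});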

jonan: we don’t have service workers in safari and we won't for the foreseeable future.

tom: well, we don’t have it in safari but we have it in firefox and we have it in chrome. you know, the p in pwa, it stands for progressive web app, so you can progressively add this to your application. you know i think the best way to get features into a browser is to adopt them and say "hey, if you're using an android phone, you have this really awesome experience. but if you have an iphone, you know, maybe it's not as awesome."

apple, i truly believe, really cares about the user experience. if there's one thing i've gotten from the safari team, it's that they always prioritize making a feature fast and not draining the user's battery over being able to check a box.

so i actually have a lot of respect for their position, and i think if they do service workers, they're going to do it right. if they see that people are having a better user experience on an android phone than an iphone that is hugely motivating for them.

terence: does the service worker impact the batteries on phones?

tom: it could, it could, yeah. i think what browser vendors are going to have to figure out is what is the right heuristic for making sure that we can run a service worker, but only in service of the user, pardon the pun.

how do we make sure that people aren't using it maliciously? how do i make sure this website is not mining bitcoin on your phone and now your battery life is two hours, you know?

jonan: sure, yeah.

tom: it's a really tricky problem.

jonan: even if they're relatively innocuous. they don’t necessarily need to be malicious. if you've got a hundred of them and they're all just trying to fetch the same image online, this will dramatically impact your phone's performance.

tom: yeah, absolutely. or if you think about, you know, you install a native app and all of a sudden, you start getting push notifications, that's not great for your battery life either.

terence: i guess you talked about progressive web apps in last year's keynote -- what has the uptake been since then? i know it was kind of a work in progress, and we just saw two talks yesterday related to progressive web apps.

tom: yup.

terence: so has the adoption been pretty strong within the community?

tom: yeah, absolutely. i think people are really excited about it. i think there are so many aspects to progressive web apps, and i think the definition isn't clear exactly. it's one of these terms that people talk about. sometimes, it becomes more of a buzzword than a very concrete thing. there are a lot of things that you can do on the path to a progressive web app.

so service worker, as jonan said, is the one thing that people think about the most, but there are also things like server-side rendering, to make sure that the first few bytes that you send to the user are in service of getting content in front of them, not just loading your dependency injection library.

jonan: right.

tom: you really wanna get the content first. there's the ability to run offline, there's the ability to add to the home screen as a first-class icon, the ability to do push notifications.

jonan: removing the browser chrome, making it feel like a native app experience.

tom: yup, and actually, android has done some really awesome work here to make a progressive web app integrate into the operating system such that, as a user, you can't really tell. that's the dream.

jonan: yeah, of course.

tom: the community uptake has been phenomenal, and this is exactly one of those things where it's gonna require experimentation. this is a brand new thing. people don’t know the optimal way to use it yet and that experimentation is happening.

there are a ton of ember add-ons: there are service worker add-ons, there are add-ons for adding an app manifest so you get the icon. all sorts of cool stuff.

i think what we should start thinking about is, "okay, well what is the mature part of this that we can start baking into the default experience when we make a new ember app, such that you get a pwa for free?", and i would guess that we are probably on the way there. sometime this year or early next year we'll be able to say, "you just get this really awesome pwa out of the box when you make a new ember app."

jonan: that will be fantastic. i would like that very much.

tom: defaults are important. i think if you care about the web, especially the mobile web being fast, the highest impact thing you can do is find out what developers are doing today and make the default the right thing.

terence: so do you imagine in the next couple years, pwa and fastboot are just going to be baked into new ember apps?

tom: i certainly hope so. i don’t think we want to do it before it's ready. fastboot, in particular of course, introduces a server-side dependency.

one thing that people really like about client-side apps is that i don't need to run my own server, i can just upload to some kind of cdn. that's nice -- i don't like doing ops. that's why i use heroku, so i don't have to think about ops. so that's the hard thing about server-side rendering: it does introduce computation requirements when you deploy.

so i don’t know if fastboot will ever be the default per se, but i do know that i want to make it really easy and at least give people the option.

"hey, server-side rendering is really important for many kinds of apps. do you wanna turn it on?" the pwa stuff, i think we can do it within the existing parameters of being able to do static deploys, so yeah, let's do it.

terence: if you have fastboot on the app it’s totally optional though right?

tom: yes, totally optional.

terence: you can still deploy the assets and ignore fastboot completely, even if it was part of the standard app, right?

tom: that's true. yeah, that's true, and really that, i think, is the beauty of client-side architecture plus server-side rendering. "oh, my server is over capacity." well, you can just have your load balancer fall back to the static site, and maybe the user doesn’t get the first view as fast but they still get the full experience.

so much of what fastboot is, is this conventional way of having not just the server-side rendering but also having a good user experience around it. so much of that relies on the good bits of ember, the very conventional structure. so i think glimmer will rapidly support server-side rendering but massaging that into an easy-to-use thing is, i think, an ember responsibility.

jonan: the vms that we're talking about, with glimmer on the frontend -- the update and render vms -- are written in typescript.

tom: that's right.

jonan: you mentioned during your keynote that there were some features you added to typescript 2.2 -- or worked with the typescript team to add to typescript 2.2 and 2.3 -- to enable glimmer? or am i misunderstanding something?

tom: it's not enabling glimmer per se, because glimmer 2 has been written in typescript from the beginning. i think when we started, typescript was on 1.8. when you make a new glimmer app, the default is to get typescript, and that just works out of the box. because the library is written in typescript, you get awesome code completion, you get intellisense, you get inline documentation -- all these things automatically.

i can't say enough positive things about the typescript team. they are so professional, they are so responsive. we even asked daniel rosenwasser, who is the pm, last week "hey do you wanna come to emberconf next week?" "i will come, because i really want to meet the ember community." they're really, really wonderful.

so for glimmer, the internals, because it's written in typescript, there were really no problems. but the thing that they realized is, "hey, there's actually this long tail of libraries that come from earlier versions of javascript like when es3 and es5 were kind of cutting edge, that built their own object model on top of javascript."

so if you look at ember, for example, you have the ember object model, where you have .get and .set, and you have ember.object.extend and ember.object.create. before we had es6 classes, we had no choice but to build our own on top of the language. the problem is we need some way to let typescript know, "hey, when we call ember.object.extend, that's not some random method, that's actually defining a type. that's defining a class."

the typescript team has been really awesome, saying, "okay, how do we rationalize that and add the extension points where a system like ember or…" -- i mean, here's the thing: every javascript library from that era has its own system like this. so they've built these really awesome primitives in typescript that let you express keyof types and mapped types.

"hey, when you see ember.object.extend, we're gonna pass it to pojo, plain old javascript object as an argument. that's not just a bag of data. i want you to actually look at the keys inside of that object and treat those like types."

so that's the thing we're really excited about because, of course, you're going to be writing glimmer apps, you're going to be writing glimmer components.

you're going to get these really nice typescript features but then we don’t want you to have to go back to ember code and miss those benefits.

jonan: that's a fantastic feature to have in a language and it's a difficult thing to bring yourself to add, i would imagine, if you're maintaining something like typescript. i think this is a smart way to approach the problem.

tom: yes.

jonan: but you're looking at all of these people with their own legacy object models, and thinking, "i have an object model now, and i want people to use the object model that exists in this language," right?

tom: exactly, yes.

jonan: how do i let you also just roll your own object model? it's a pretty fundamental part of a programming language.

tom: it is, yeah, and that's what i mean about professionalism. i really, really appreciate the typescript team thinking so carefully about adoption, because i think it really requires maturity to do that. how do we bridge the gap, reach people where they are today? and then we can slowly bring them into the new, modern world as they do new things. i think that's hugely important and i think it's one thing that many people in the javascript community undervalue. it is such a breath of fresh air to see it from typescript.

jonan: that's great.

terence: yeah. it seems to align a lot with all the stuff ember tries to do with the way it does its features.

jonan: so at the very end of the keynote… you ran a little long on the keynote which is a rare thing to see.

tom: yeah, yeah, very rare.

jonan: this year, you were overtime a little bit and you flipped through some content very quickly at the end. i was hoping maybe you could give us a peek at some of those things you didn’t get time to talk about in your keynote, that you wish you had time to mention.

tom: i think if we had had more time, one thing i would have really loved to go into more was the glimmer api. i see the glimmer api for components as being the future of how you do components in ember, and we have focused really hard on making these things feel like just regular javascript.

like i was saying, when ember came of age, we didn't have es6 classes. we couldn't even use es5 because it wasn't adopted widely enough. so we built our own object model on top.

then rapidly, all of a sudden, the pace of javascript's development picked up, and now we have classes, and we have decorators, and we have getters, and we have all these amazing new things. because it happened right after we stabilized our api, people who already know javascript look at ember sometimes and think, you know, that it feels like we're doing our own weird thing. it's like "i don't wanna do it the ember way. i wanna do it the javascript way."

so what we tried really, really hard to do with glimmer is say, "okay, let's think about what someone who only knows javascript or modern javascript, what do they know and what are they expecting?" and let's just make the whole thing feel easy and natural for them.

so for example, glimmer component when you define it is just an es6 class that extends the glimmer component base class. the way that you import the glimmer component is a standard import. then there's a proposal in javascript called "decorators," which i believe is stage two. that lets you add certain annotations to properties, and methods, and classes and so on.

now in glimmer we have introduced something called "tracked properties", but more importantly, in glimmer you don't actually need any kind of annotation, because your computed properties are just getters, which are built into the language. of course, if you want to do change tracking -- "hey, this computed property changed, how do i update the dom?" -- you have a very simple decorator. so you don't have to have this weird ember thing, you just use what's in the language.
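for example, a glimmer component along those lines might look something like this (a sketch that assumes the @glimmer/component package's default export and tracked decorator as they existed at the time; the temperature widget itself is made up):

import Component, { tracked } from '@glimmer/component';

// a plain es6 class extending the glimmer component base class, pulled in
// with a standard module import.
export default class TemperatureWidget extends Component {
  // the decorator opts this property into change tracking so the dom can
  // update when it changes.
  @tracked celsius = 21;

  // a computed property is just a regular javascript getter.
  get fahrenheit(): number {
    return (this.celsius * 9) / 5 + 32;
  }

  warmUp(): void {
    this.celsius++;
  }
}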

jonan: which is hopefully going to increase adoption.

tom: i hope so, yeah.

jonan: this is a common problem, not just in the javascript community. you're coming up with new frameworks and you're moving very quickly. javascript, in particular, is moving very quickly. it seems like every week, or month, there's some new tool that i would have to learn, right?

tom: yeah.

jonan: something new and each one of them has their own distinct syntax, constantly changing. if you keep moving the goal post, eventually people tire of it. i consider the approach you took with glimmer to be a very mature approach, and i really appreciate the effort you put in to make that.

tom: i think when people see glimmer, it's very easy for their reaction to be "oh, god, here comes another javascript library." what i hope is that people can look at our track record, and i hope we have some credibility with people, and see that, "hey, we're not just talking a big game here. we actually have a community that has gone back at least five years. and we have apps that are five years old that have migrated."

so i just hope people can feel safe when they look at glimmer. it checks all the boxes that you need to check in 2017, but it also comes with the same community and the same core team that really values stability, that values migration, that values convention.

jonan: and speed.

tom: yeah, and speed.

jonan: i think speed is the real reward from glimmer. you build something in glimmer and you, somehow, have accomplished this impossible tradeoff where you have a fast render speed and a fast update speed.

tom: i think it's interesting too because, you know, this always happens with benchmarks. there's some suite of benchmarks that comes out, people become over-focused on one particular metric.

jonan: right.

tom: in this case, the community has really focused, in the last year, on initial render performance. initial render performance is super, super important, but it's not always worth sacrificing updating performance for it. i think glimmer has hit this really nice sweet spot where it's not quite as fast as the absolute fastest rendering library in terms of initial rendering, but it blows away all the other rendering engines at updates.

being the absolute fastest at initial render only matters as much as the user can notice the difference. it's not worth sacrificing everything over a constant-time difference that's imperceptible to a human, and i'm really excited about the sweet spot that we've hit.

jonan: we were talking the other day at lunch about the fact that there are some pages where i really don’t mind a long load time. if i'm going to a dashboard for a product that i've already purchased, i'm gonna sit there and wait. like, yeah, maybe it takes 10 seconds, right, and i'm gonna be super annoyed and think, "wow, why am i paying these people money?" right? but for some definition of fast, all things start to be equal, when we get down towards those lower numbers.

tom: that’s right and i think people conflate those. you know, it's easy to get in a twitter flame war because i'm talking about my dashboard that people are gonna sit on all day. you're talking about this ecommerce site. if you don’t have a response in under 200 milliseconds, people are gonna bounce and you're not gonna make your money. so those are different categories.

that being said, i really do believe in my heart that there is a future where you can build your big dashboard app and it doesn’t take forever to load if we make the tools really good.

jonan: thank you so much for taking the time to talk to us today. i really appreciate it. do you have anything else you wanna share? last minute thoughts?

tom: oh, i just cannot wait to take a vacation in barbados for a week.

jonan: tom, thank you so much for being here.

tom: thank you, jonan, and thank you, terence.

terence: thank you.


Information

today we are proud to announce that heroku ci, a low-configuration test runner for unit and browser testing that is tightly integrated with heroku pipelines, is now generally available.


to build software with optimal feature release speed and quality, continuous integration (ci) is a popular best practice, and an essential part of a complete continuous delivery (cd) practice. as we have done for builds, deployments, and cd, heroku ci dramatically improves the ease, experience, and function of ci. now your energy can go into your apps, not your process.

with today's addition of heroku ci, heroku now offers a complete ci/cd solution for developers in all of our officially supported languages: node, ruby, java, python, go, scala, php, and clojure. as you would expect from heroku, heroku ci is simple, powerful, visual, and prescriptive. it is intended to provide the features and flexibility to be the complete ci solution for the vast majority of application development situations, serving use cases that range from small innovation teams to large enterprise projects.

easy to set up and use


configuration of heroku ci is minimal (often zero), and there is no it setup involved; heroku ci is automatically available and coordinated for all apps in heroku pipelines. just turn on heroku ci for the pipeline, and each push to github will run your tests. tests reside in the location that is typical for each supported language -- for example, go tests typically reside in files whose names end in "_test.go". these tests are executed automatically on each git push, so no learning curve is involved, and little reconfiguration is typically necessary when migrating to heroku ci from jenkins and other ci systems.

for users who are also new to continuous delivery, we've made heroku pipelines set-up easier than ever with a straightforward 3-step setup that automatically creates and configures your review, development, staging, and production apps. all that's left is to click the "tests" tab and turn on heroku ci.

visual at every stage


from setup, to running tests, to ci management, everything about heroku ci is intended to be fully visual and intuitive -- even for users who are new to continuous integration. for each app, the status of the latest or currently running test run is shown clearly on the pipelines page. test actions are a click away, and fully available via the ui: re-run any test, run new tests against an arbitrary branch, search previous tests by branch or pull request, and see full detail for any previous test. and heroku ci integrates seamlessly with github -- on every git push your tests run, allowing you to also see the test result within the github web or github desktop interfaces.

ci users who want more granular control, direct debug access, and programmatic control of ci actions can use the cli interface for heroku ci.

power, speed, and flexibility

for every test you run, heroku ci creates and populates an ephemeral app environment that mirrors your staging and production environments. these ci apps are created automatically, and then destroyed immediately after test runs complete. all the add-ons, databases, and configurations your code requires are optimized for test speed, and parity with downstream environments. over the beta period, we have been working with add-on partners to make sure the ci experience is fast and seamless.

setup and tear-down for each ci run happens in seconds. because we use these ephemeral heroku apps to run your tests, there is no queue time (as is common with many ci systems). your tests run immediately, every time on dedicated performance dynos.

across the thousands of participants in our public beta, most developers observed test runs completing significantly faster than expected.

cost-effective

we view ci as an essential part of effective development workflows -- that is, part of a good overall delivery process.

each ci-enabled heroku pipeline is charged just $10/month for an unlimited number of test runs. for each test run, dyno charges apply only for the duration of tests. we recommend and default to performance-m dynos to power test runs, and you can specify other dyno sizes.

note that all charges are pro-rated per second, with no commitment, so you can try out heroku ci for pennies -- usually with little modification to your existing test scripts.
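as a rough illustration, assuming a performance-m dyno at roughly $250/month (which pro-rates to about $0.0001 per second), a ten-minute test run would cost around six cents in dyno time on top of the $10/month pipeline fee.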

enterprise-ready

all heroku enterprise customers get unlimited ci-enabled pipelines, and an unlimited number of test runs, all, of course, with zero queue time. no provisioning, authentication set-up, or management of ci is required for new projects, and heroku ci can be turned on for any heroku pipeline with a single click.

existing heroku enterprise dyno credits are automatically used for test runs, and invoices will contain a new section listing the ci-enabled pipelines alongside the account-wide dyno usage for ci test runs.

all test run results are available at permanent urls that can be referenced for compliance regimes, and all authentication is managed under existing heroku enterprise teams (org) security. unification of security, authentication, and billing between ci and production deployments, along with a prescriptive methodology across company projects, lets enterprises innovate on heroku with the agility of a start-up.

heroku-built, community-hardened

"pleasant" and "beautiful" are not terms usually associated with ci systems, yet we think heroku ci is among the most pleasant, beautiful software testing systems available -- and we have you to thank for this. more than 1500 beta users tested heroku ci, surfacing bugs and offering suggestions: telling us that some webhooks got dropped, that an icon on the tab might be nice, that it should be more obvious how to re-run a test ... and roughly 600 other notes, many of which grew into e-mail conversations with you. as with all software, we will keep perfecting it, and we are pretty proud of what we have here. thank you, and keep the comments coming!

get started

it's easy. set up a heroku pipeline and you're ready. there's even a two-minute video here and a simple how-to. give it a spin, and let us know what you think.


Information

heroku has always made it easy for you to extend your apps with add-ons. starting today, partners can access the platform api to build a more secure and cohesive developer experience between add-ons and heroku.

advancing the add-on user experience

several add-ons are already using the new platform api for partners. adept scale, a long-time add-on in our marketplace that provides automated scaling of heroku dynos, has updated its integration to offer a stronger security stance, with properly scoped access to each app it is added to. existing customer integrations have been updated as of friday may 12th. all new installs of adept scale will use the more secure, scoped platform api.

opbeat, a performance monitoring service for node.js developers, is using the platform api in production to sync its user roles to match heroku. it is also synchronizing metadata, so that its data stays in sync with heroku when users make changes, for instance when renaming a heroku app. this connection enables a more cohesive experience between the two tools.

a list of standard endpoints that partners can use is documented in the dev center, with more functionality coming soon. for new integrations that may require additional endpoints, we ask partners to reach out to us directly about making specific endpoints from the platform api available. please contact us with information about your intended integration.
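
to give a feel for what partner integrations work with, the platform api is the same versioned rest interface available to any heroku user. a hypothetical request for an app's details looks roughly like this (the app name is a placeholder, and partner integrations use their own scoped credentials rather than a personal api key):

$ curl -s https://api.heroku.com/apps/example-app \
    -H "Accept: application/vnd.heroku+json; version=3" \
    -H "Authorization: Bearer $HEROKU_API_KEY"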

as add-on partner adoption of the platform api grows, heroku customers can expect to see a more cohesive, reliable and secure developer experience when using add-ons, and a wider range of add-on offerings in our elements marketplace.


Information

today, we are excited to announce dns service discovery for heroku private spaces, an easy way to find and coordinate services for microservice-style deployments.

as applications grow in sophistication and scale, developers often organize them into small, purpose-built “microservices”. these microservices act in unison to achieve what would otherwise be handled by a single, larger monolithic application, which simplifies each codebase and improves overall reliability.

dns service discovery is a valuable component of a true microservices architecture. it is a simple, yet effective way to facilitate microservice-style application architecture on private spaces using standard dns naming conventions. as a result, your applications can now know in advance how they should reach the other process types and services needed to do their job.

how it works

dns service discovery allows you to connect these services together by providing a naming scheme for finding individual dynos within your private space. every process type for every application in the space is configured to respond to a standard dns name of the format <process-type>.<application-name>.app.localspace.

example:

$ nslookup web.myapp.app.localspace
web.myapp.app.localspace. 0 IN A 10.10.10.11
web.myapp.app.localspace. 0 IN A 10.10.10.10
web.myapp.app.localspace. 0 IN A 10.10.10.9

this is enabled by default on all newly created applications in private spaces. for existing private spaces applications, you need to run:

$ heroku features:enable spaces-dns-discovery --app <app name> 
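
once discovery is enabled, any dyno in the space can resolve its peers with an ordinary dns lookup. here is a minimal node sketch -- the process type and app name are placeholders for your own:

const dns = require('dns');

// resolves to one A record per running "web" dyno of the app "myapp"
dns.resolve4('web.myapp.app.localspace', (err, addresses) => {
  if (err) throw err;
  addresses.forEach((ip) => console.log(`web dyno at ${ip}`));
});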

when combined with heroku flow’s continuous delivery approach, the benefits of a microservices architecture are further realized. for example, in a distributed system, each application can have a smaller footprint and a more focused purpose - so when it comes time to push updates to this system, your team can modify and continuously deliver a single portion of your architecture, instead of having to cycle out the entirety of your application. and when your application’s traffic grows, you can scale up just the portion of your system that requires extra cycles, resulting in a more flexible and economical use of resources.

learn more

we’re excited to see the new possibilities service discovery opens up for microservices architectures. if you are interested in learning more about dns service discovery for your applications in private spaces, please check out our dev center article or contact us with further questions.


Information

today we are happy to announce heroku shield, a new addition to our heroku enterprise line of products. heroku shield introduces new capabilities to dynos, postgres databases and private spaces that make heroku suitable for high compliance environments such as healthcare apps regulated by the health insurance portability and accountability act (hipaa). with heroku shield, the power and productivity of heroku is now easily available to a whole new class of strictly regulated apps.

at the core of heroku’s products is the idea that developers can turn great ideas into successful customer experiences at a surprising pace when all unnecessary and irrelevant elements of application infrastructure are systematically abstracted away. the design of heroku shield started with the question: what if regulatory and compliance complexity could be transformed into a simple developer experience, just as has been done for infrastructure complexity? the outcome is a simple, elegant user experience that abstracts away compliance complexity while freeing development teams to use the tools and services they love in a new class of app.

heroku shield is generally available to heroku enterprise customers. for more information about heroku enterprise, please contact us here.

how it works

shield private spaces

to use heroku shield, start by creating a new private space and switch on the shield option. the first thing you notice is that logging is now configured at the space level. with private space logging, logs from all apps and control systems are automatically forwarded to the logging destination configured for the space. this greatly simplifies compliance auditing while still leaving the developers in full control of app configuration and deployment.

a shield private space also adds a critical compliance feature to the heroku run command used by developers to access production apps for administrative and diagnostic tasks. in a shield private space, all keystrokes typed in an interactive heroku run session are logged automatically. this meets a critical compliance requirement to audit all production access without restricting developers from performing diagnostics and time-sensitive remediation tasks directly on production environments.

shield private dynos and postgres

in a shield private space you can create special shield flavors of dynos and postgres databases. the shield private dyno includes an encrypted ephemeral file system and prevents ssl termination from using tls 1.0, which is considered vulnerable. shield private postgres further guarantees that data is always encrypted in transit and at rest. heroku also captures a high volume of security monitoring events for shield dynos and databases, which helps meet regulatory requirements without imposing any extra burden on developers.

app innovation for healthcare and beyond

with heroku shield, you can now build healthcare apps on heroku that handle protected health information (phi) in compliance with the united states hipaa framework. the healthcare industry is living proof of how challenging it is to modernize application delivery while meeting strict compliance requirements. all you have to do is compare the user experience of most healthcare apps with what you have come to expect from apps in less regulated industries like e-commerce, productivity and social networks.

it's simply too hard to evolve and modernize healthcare apps today because they are delivered using outdated, rigid platforms and practices. at heroku, we are doing our small part to change this by providing development teams a hipaa-ready platform with the industry's best continuous delivery experience.

of course, this is just a step on our trust journey - the work of providing more security and compliance capabilities is never complete. we are already working on new capabilities and certifications for heroku shield, and as always, we look to our customers and the developer community for input on how to direct and prioritize those efforts.

summary

combining developer creativity with the opportunities for innovation in high-compliance industries is a potent mix. heroku has had the privilege of seeing what becomes possible when obstacles are removed from developers' paths, and with shield, we hope to see that promise amplified yet again. for more information on shield, see the dev center article here, or contact heroku.


Information

you’re using a continuous delivery pipeline because it takes the manual steps out of code deployment. but when a release includes updates to a database schema, the deployment requires manual intervention and team coordination. typically, someone on the team will log into the database and run the migration, then quickly deploy the new code to production. it's a process rife with deployment risk.

now with release phase, generally available today, you can define tasks that must run before a new release is deployed. simply push your code and release phase will automatically run your database schema migration, upload static assets to a cdn, or perform any other task your app needs to be ready for production. if a release phase task fails, the new release is not deployed, leaving the current production release unaffected.

to get started, view the release phase documentation.

a release phase example

let’s say you have a node.js app, using sequelize as your orm, and want to run a database migration on your next release. simply define a release command in your procfile:

release: node_modules/.bin/sequelize db:migrate
web: node ./bin/www

when you run git push heroku master and the build succeeds, release phase runs the migration in a one-off dyno. if the migration is successful, the app code is deployed to production. if the migration fails, the release is not deployed, and you can check your release phase logs to debug.

$ git push heroku master
...
running release command....
--- migrating db ---
sequelize [node: 7.9.0, cli: 2.7.9, orm: 3.30.4]
loaded configuration file "config/config.json".
using environment "production".
== 20170413204504-create-post: migrating ======
== 20170413204504-create-post: migrated (0.054s)
v23 successfully deployed
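
if you want to revisit a run after the fact, the heroku cli keeps the release history around, and recent cli versions can also show the output of the release command itself -- run heroku help releases to see what your version supports:

$ heroku releases --app <app name>
$ heroku releases:output v23 --app <app name>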

check out the video to watch it in action:

heroku flow + release phase

heroku flow provides you with a professional continuous delivery pipeline with dev, staging, and production environments. when you promote a release from staging to production, release phase will automatically run your tasks in the production environment.

with heroku flow you always know where a particular feature is on the path to production. now -- with release phase -- the path to production has even fewer manual steps.


Information

it’s been a little over a year since our last happy node hackers post, and even in such a short time much has changed and some powerful new tools have been released. the node.js ecosystem continues to mature and new best practices have emerged.

here are 8 habits for happy node hackers, updated for 2017. they're aimed at app developers rather than module authors, since those two groups have different goals and constraints:

1. lock down your dependency tree

in modern node applications, your code is often only the tip of an iceberg. even a small application could have thousands of lines of javascript hidden in node_modules. even if your application specifies exact dependency versions in package.json, the libraries you depend on probably don't. over time, you'll get slightly different code for each install, leading to unpredictability and potentially introducing bugs.

in the past year facebook surprised the node world when it announced yarn, a new package manager that let you use npm's vast registry of nearly half a million modules and featured a lockfile that saves the exact version of every module in your dependency tree. this means that you can be confident that the exact same code will be downloaded every time you deploy your application.

not to be outdone, npm released a new version with a lockfile of its own. oh, and it's a lot faster now too. this means that whichever modern package manager you choose, you'll see a big improvement in install times and fewer errors in production.

to get started with yarn, install it globally with npm install -g yarn and run yarn in your application’s directory. this will install your dependencies and generate a yarn.lock file which tells heroku to use yarn when building your application.

to use npm 5, update locally by running npm install -g npm@5 and reinstall your application's dependencies by running rm -rf node_modules && npm install. the generated package-lock.json will let heroku know to use npm 5 to install your modules.

2. hook things up

lifecycle scripts make great hooks for automation. if you need to run something before building your app, you can use the preinstall script. need to build assets with grunt, gulp, browserify, or webpack? do it in the postinstall script.

in package.json:

"scripts": { "postinstall": "grunt build", "start": "node app.js" } 

you can also use environment variables to control these scripts:

"postinstall": "if $build_assets; then npm run build-assets; fi", "build-assets": "grunt build" 

if your scripts start getting out of control, move them to files:

"postinstall": "scripts/postinstall.sh" 

3. modernize your javascript

with the release of node 8, the days of maintaining a complicated build system to write our application in es2015, also known as es6, are mostly behind us. node is now 99% feature complete with the es2015 spec, which means you can use new features such as template literals or destructuring assignment with no ceremony or build process!

const combinations = [
  { number: "8.0.0", platform: "linux-x64" },
  { number: "8.0.0", platform: "darwin-x64" },
  { number: "7.9.0", platform: "linux-x64" },
  { number: "7.9.0", platform: "darwin-x64" }
];

for (let { number, platform } of combinations) {
  console.log(`node-v${number}-${platform}.tar.gz`);
}

there are a ton of additions, and overall they work together to significantly increase the legibility of javascript and make your code more expressive.

4. keep your promises

beyond es2015, node 8 supports the long-awaited async and await keywords without opting in to experimental features. this feature builds on top of promises allowing you to write asynchronous code that looks like synchronous code and has the same error handling semantics, making it easier to write, easier to understand, and safer.

you can re-write nested callback code that looks like this:

function getPhotos(fn) {
  getUsers((err, users) => {
    if (err) return fn(err);
    getAlbums(users, (err, albums) => {
      if (err) return fn(err);
      getPhotosForAlbums(albums, (err, photos) => {
        if (err) return fn(err);
        fn(null, photos);
      });
    });
  });
}

into code that reads top-down instead of inside-out:

async function getPhotos() {
  const users = await getUsers();
  const albums = await getAlbums(users);
  return getPhotosForAlbums(albums);
}

you can call await on any call that returns a promise. if you have functions that still expect callbacks, node 8 ships with util.promisify which can automatically turn a function written in the callback style into a function that can be used with await.
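
for example, here is a small sketch that wraps fs.readFile, a callback-style api, so it can be awaited (the config.json filename is just an illustration):

const fs = require('fs');
const { promisify } = require('util');

// turn the callback-style fs.readFile into a function that returns a promise
const readFile = promisify(fs.readFile);

async function loadConfig() {
  const raw = await readFile('config.json', 'utf8');
  return JSON.parse(raw);
}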

5. automate your code formatting with prettier

we’ve all collectively spent too much time formatting code, adding a space here, aligning a comment there, and we all do it slightly differently than our teammate two desks down. this leads to endless debates about where the semicolon goes or whether we should use semicolons at all. prettier is an open source tool that promises to finally eliminate those pointless arguments for good. you can write your code in any style you like, and with one command it’s all formatted consistently.

that may sound like a small thing but freeing yourself from arranging whitespace quickly feels liberating. prettier was only released a few months ago, but it's already been adopted by babel, react, khan academy, bloomberg, and more!

if you hate writing semicolons, let prettier add them for you, or your whole team can banish them forever with the --no-semi option. prettier supports es2015 and flow syntax, and the recent 1.4.0 release added support for css and typescript as well.

there are integrations with all major text editors, but we recommend setting it up as a pre-commit hook or with a lifecycle script in package.json.

"scripts": { "prettify": "prettier --write 'src/**/*.js'" } 

6. test continuously

pushing out a new feature and finding out that you've broken the production application is a terrible feeling. you can avoid this mistake if you’re diligent about writing tests for the code you write, but it can take a lot of time to write a good test suite. besides, that feature needs to be shipped yesterday, and this is only a first version. why write tests that will only have to be re-written next week?

writing unit tests in a framework like mocha or jest is one of the best ways of making sure that your javascript code is robust and well-designed. however, there is a lot of code that may not justify the time investment of an extensive test suite. the testing library jest has a feature called snapshot testing that can help you get insight and visibility into code that would otherwise go untested. instead of deciding ahead of time what the expected output of a function call should be and writing a test around it, jest will save the actual output into a local file on the first run, then compare it to the output on the next run and alert you if it's changed.

while this won't tell you if your code is working exactly as you'd planned when you wrote it, this does allow you to observe what changes you're actually introducing into your application as you move quickly and develop new features. when the output changes you can quickly update the snapshots with a command, and they will be checked into your git history along with your code.

it("test /endpoint", async () => { const res = await request(`http://0.0.0.0:5000/endpoint`); const body = await res.json(); const { status, headers } = res; expect({ status, body, headers }).tomatchsnapshot(); }); 

example repo

once you've tested your code, setting up a good ci workflow is one way of making sure that it stays tested. to that end, we launched heroku ci. it’s built into the heroku continuous delivery workflow, and you'll never wait for a queue. check it out!

don't need the fancy features and just want a super simple test runner? check out tape for your minimal testing needs.

7. wear your helmet

for web application security, a lot of the important yet easy work of locking down a given app can be done by returning the right http headers.

you won't get most of these headers with a default express application, so if you want to put an application in production with express, you can go pretty far by using helmet. helmet is an express middleware module for securing your app mainly via http headers.

helmet helps you prevent cross-site scripting attacks, protect against click-jacking, and more! it takes just a few lines to add basic security to an existing express application:

const express = require('express');
const helmet = require('helmet');

const app = express();
app.use(helmet());

read more about helmet and other express security best practices

8. https all the things

by using private connections by default, we make it the norm, and everyone is safer. as web engineers, there is no reason we shouldn’t default all traffic in our applications to using https.

in an express application, there are several things you need to do to make sure you're serving your site over https. first, make sure the strict-transport-security header (often abbreviated as hsts) is set on the response. this instructs the browser to always send requests over https. if you’re using helmet, then this is already done for you!
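
if you are not using helmet, you can set the header yourself with a tiny middleware. the max-age value below is only an example, so pick one that matches how long you are willing to commit to https:

const express = require('express');
const app = express();

// tell browsers to use https for the next year, including subdomains
app.use((req, res, next) => {
  res.setHeader('Strict-Transport-Security', 'max-age=31536000; includeSubDomains');
  next();
});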

then make sure that you're redirecting any http requests that do make it to the server to the same url over https. the express-enforces-ssl middleware provides an easy way to do this.

const express = require('express');
const expressEnforcesSSL = require('express-enforces-ssl');

const app = express();
app.enable('trust proxy');
app.use(expressEnforcesSSL());

additionally you'll need a tls certificate from a certificate authority. but if you are deploying your application to heroku and using any hobby or professional dyno, you will automatically get tls certificates set up through let’s encrypt for your custom domains by our automated certificate management – and for applications without a custom domain, we provide a wildcard certificate for *.herokuapp.com.

what are your habits?

i try to follow these habits in all of my projects. whether you’re new to node or a server-side js veteran, i’m sure you’ve developed tricks of your own. we’d love to hear them! share your habits by tweeting with the #node_habits hashtag.

happy hacking!


Information

it’s rare for a highly structured language with fairly strict syntax to spark emotions of joy and delight. but kotlin, which is statically typed and compiled like other less friendly languages, delivers a developer experience that thousands of mobile and web programmers are falling in love with.

the designers of kotlin, who have years of experience with developer tooling (intellij and other ides), created a language with very specific developer-oriented requirements. they wanted a modern syntax, fast compile times, and advanced concurrency constructs while taking advantage of the robust performance and reliability of the jvm. the result, kotlin 1.0, was released in february 2016 and its trajectory since then has been remarkable. google recently announced official support for kotlin on android, and many server-side technologies have introduced kotlin as a feature.

the spring community announced support for kotlin in spring framework 5.0 last month and the vert.x web server has worked with kotlin for over a year. kotlin integrates with most existing web applications and frameworks out-of-the-box because it's fully interoperable with java, making it easy to use your favorite libraries and tools.

but ultimately, kotlin is winning developers over because it’s a great language. let’s take a look at why it makes us so happy.

a quick look at kotlin

the first thing you’ll notice about kotlin is how streamlined it is compared to java. its syntax borrows from languages like groovy and scala, which reduce boilerplate by making semicolons optional as statement terminators, simplifying for loops, and adding support for string templating among other things. a simple example in kotlin is adding two numbers inside of a string like this:

val sum: String = "sum of $a and $b is ${a + b}"

the val keyword is a feature borrowed from scala. it defines an immutable variable, which in this case is explicitly typed as a string. but kotlin can also infer that type. for example, you could write:

val x = 5 

in this case, the type int is inferred by the compiler. that’s not to say the type is dynamic though. kotlin is statically typed, but it uses type inference to reduce boilerplate.

like many of the jvm languages it borrows from, kotlin makes it easier to use functions and lambdas. for example, you can filter a list by passing it an anonymous function as a predicate:

val positives = list.filter { it > 0 } 

the it variable in the lambda body is the implicit name for the lambda's single parameter, a convention borrowed from groovy that eliminates the boilerplate of declaring it explicitly.

you can also define named functions with the fun keyword. the following example creates a function with default arguments, another great kotlin feature that cleans up your code:

fun printName(name: String = "John Doe") {
    println(name)
}

but kotlin does more than borrow from other languages. it introduces new capabilities that other jvm languages lack. most notable are null safety and coroutines.

null safety means that a kotlin variable cannot be set to null unless it is explicitly defined as a nullable variable. for example, the following code would generate a compiler error:

val message: String = null

but if you add a ? to the type, it becomes nullable. thus, the following code is valid to the compiler:

val message: String? = null

null safety is a small but powerful feature that prevents numerous runtime errors in your applications.

coroutines, on the other hand, are more than just syntactic sugar. coroutines are chunks of code that can be suspended to prevent blocking a thread of execution, which greatly simplifies asynchronous programming.

for example, the following program starts 100,000 coroutines using the launch function. the body of the coroutine can be paused at a suspension point so the main thread of execution can perform some other work while it waits:

// requires the kotlinx-coroutines-core dependency (experimental in kotlin 1.1)
import java.util.Random
import kotlinx.coroutines.experimental.*

fun main(args: Array<String>) = runBlocking<Unit> {
    var number = 0
    val random = Random()
    val jobs = List(100_000) {
        launch(CommonPool) {
            delay(10)
            number += random.nextInt(100)
        }
    }
    jobs.forEach { it.join() }
    println("the answer is: $number")
}

the suspension point is the delay call. while it waits, each coroutine simply adds a random number to the running total, and the final answer is printed once every job has completed.

coroutines are still an experimental feature in kotlin 1.1, but early adopters can use them in their applications today.

despite all of these great examples, the most important feature of kotlin is its ability to integrate seamlessly with java. you can mix kotlin code into an application that’s already based on java, and you can consume java apis from kotlin with ease, which smooths the transition and provides a solid foundation.

kotlin sits on the shoulders of giants

behind every successful technology is a strong ecosystem. without the right tools and community, a new programming language will never achieve the uptake required to become a success. that’s why it’s so important that kotlin is built into the java ecosystem rather than outside of it.

kotlin works seamlessly with maven and gradle, which are two of the most reliable and mature build tools in the industry. unlike other programming languages that attempted to separate from the jvm ecosystem by reinventing dependency management, kotlin leverages the virtues of java for its tooling. there are attempts to create kotlin-based build tools, which would be a great addition to the kotlin ecosystem, but they aren't a prerequisite for being productive with the language.

kotlin also works seamlessly with popular jvm web frameworks like spring and vert.x. you can even create a new kotlin-based spring boot application from the spring initializer web app. there has been a huge increase in adoption of kotlin for apps generated this way.

kotlin has great ide support too, thanks to its creators. the best way to learn kotlin is by pasting some java code into intellij and allowing the ide to convert it to kotlin code for you. all of these pieces come together to make a recipe for success. kotlin is poised to attract both new and old java developers because it's built on solid ground.

if you want to see how well kotlin fits into existing java tooling, try deploying a sample kotlin application on heroku using our getting started with kotlin guide. if you're familiar with heroku, you'll notice that it looks a lot like deploying any other java-based application on our platform, which helps make the learning curve for kotlin relatively flat. but why should you learn kotlin?

why kotlin?

heroku already supports five jvm languages that cover nearly every programming language paradigm in existence. do we need another jvm language? yes. we need kotlin as an alternative to java just as we needed java as an alternative to c twenty years ago. our existing jvm languages are great, but none of them have demonstrated the potential to become the de facto language of choice for a large percentage of jvm developers.

kotlin has learned from the jvm languages that preceded it and borrowed the best parts of those ecosystems. the result is a well-rounded, powerful, and production-ready platform for your apps.


Information