Don't Settle for a Sample: What Traditional Approaches to Compensation Benchmarking Get Wrong

Pave Data Lab
October 20, 2023
2 min read

Compensation benchmarks are invaluable tools that help companies set fair, competitive compensation in line with labor markets. Traditionally, compensation benchmarking providers have emphasized sample size as the primary indicator of a benchmark’s reliability: the larger the sample size, the more reliable the compensation benchmark. However, that’s only part of the picture. A comprehensive understanding of sample size, data distribution, and the impact of outliers is crucial for assessing the validity of compensation benchmarks and making well-informed compensation decisions based on them.

The Significance of Sample Size

Sample size – often referred to as the “n” – has long been the go-to metric for assessing data reliability. The historical assumption has been that increasing sample size is the only way to increase a data set’s statistical confidence, improve representativeness, and support robust decision-making. Sample size does matter, but considering it alone can lull users of compensation benchmarks into a false sense of reliability.
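To see why sample size earned that reputation, here’s a minimal Python sketch (all salary figures are made up for illustration, not Pave data): as the sample grows, the median of repeated samples drawn from the same population wanders less and less.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of base salaries (illustrative numbers only).
population = rng.normal(loc=150_000, scale=25_000, size=1_000_000)

for n in (25, 100, 400, 1_600):
    # Draw many samples of size n and see how much the sample median moves around.
    medians = [np.median(rng.choice(population, size=n)) for _ in range(2_000)]
    print(f"n={n:>5}: spread of the sample median (std) ≈ {np.std(medians):,.0f}")
```

Each fourfold increase in sample size roughly halves how much the sample median jumps around, which is the intuition behind “bigger n, better benchmark.” The catch, as the next section shows, is that this intuition depends on how the data is distributed.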

The Role of Data Distribution


Beyond sample size, how the data within a compensation benchmark is distributed matters just as much. Different distribution patterns have profound impacts on the accuracy of compensation benchmarks:

  • Normal Distribution: In cases where salary data follows a normal distribution (i.e. a bell curve), a moderate sample size is often sufficient to produce a reasonably small confidence interval. A small confidence interval – meaning the benchmark estimate, such as the median, is pinned down within a narrow range – indicates the sample is a good representation of the broader population and can be trusted. When a sample data set – like compensation benchmarks collected from a subset of companies in a location – is normally distributed, the sample mean approximates the broader population mean quite well, even with a modestly sized sample.
  • Skewed Distribution: However, when data is skewed – with salaries concentrated toward one end of the scale or clumped at multiple points along it – a larger sample size is necessary to reduce the margin of error and drive up confidence. With a smaller sample, skewed data can produce noticeably less accurate benchmarks: when data is not normally distributed, more data points are needed before the sample can be trusted to represent the broader population. The sketch below illustrates the difference.
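The contrast can be shown with a short simulation. The sketch below uses invented salary numbers and a simple bootstrap – it is not how any particular vendor computes confidence – but it demonstrates that two samples of identical size can pin down the median with very different precision depending on their shape.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200  # same sample size for both samples

# Illustrative only: one roughly normal salary sample, one right-skewed one.
normal_sample = rng.normal(loc=150_000, scale=20_000, size=n)
skewed_sample = rng.lognormal(mean=np.log(150_000), sigma=0.6, size=n)

def median_ci(sample, n_boot=5_000, alpha=0.05):
    """Bootstrap confidence interval for the sample median."""
    boots = [np.median(rng.choice(sample, size=len(sample), replace=True))
             for _ in range(n_boot)]
    lo, hi = np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    lo, hi = median_ci(sample)
    print(f"{name}: median ≈ {np.median(sample):,.0f}, "
          f"95% CI width ≈ {hi - lo:,.0f}")
```

With the same n, the skewed sample typically produces a much wider confidence interval around the median – the same sample size buys you less certainty.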

Let’s look at an example. In the image below, Benchmark A and Benchmark B have the same value at the 50th percentile and the same sample size. However, the data points in Benchmark A cluster well below and well above the 50th percentile; if you were to pay at the 50th percentile based on Benchmark A, you would find that many companies pay far below or far above that benchmark for very similar roles in the industry. Pave’s benchmarking confidence labels will tell you that Benchmark A should be used with more caution than Benchmark B.
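A quick simulation of the same idea (again with hypothetical numbers, not Pave data): two benchmarks with the same sample size and essentially the same 50th percentile, where one is tightly clustered around the median and the other splits into clusters well below and above it.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300  # same sample size for both benchmarks

# "Benchmark B": salaries tightly clustered around the median (illustrative numbers).
benchmark_b = rng.normal(loc=150_000, scale=8_000, size=n)

# "Benchmark A": two clusters well below and well above the median, few points near it.
benchmark_a = np.concatenate([
    rng.normal(loc=120_000, scale=8_000, size=n // 2),
    rng.normal(loc=180_000, scale=8_000, size=n - n // 2),
])

for name, data in [("Benchmark A", benchmark_a), ("Benchmark B", benchmark_b)]:
    p50 = np.median(data)
    typical_gap = np.median(np.abs(data - p50))  # how far a typical company sits from the p50
    print(f"{name}: p50 ≈ {p50:,.0f}, typical distance from p50 ≈ {typical_gap:,.0f}")
```

Both benchmarks report roughly the same 50th percentile, but in Benchmark A the typical company sits tens of thousands of dollars away from it – exactly the situation where a percentile-based pay decision deserves extra caution.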

Balancing Act

To help companies make well-informed compensation decisions, a benchmarking data set must paint a complete picture of both sample size and distribution patterns. This gives users an indication of the overall confidence of a benchmark. 

Modern benchmarking providers understand the interplay between sample size and distribution. They flag a benchmark as reliable (or not) only after considering both the number of data points behind it and the distribution pattern of those data points.

At Pave, we've always used these confidence scales internally to determine how reliable our data is, and now we’re giving users insight into how we measure data confidence. Within the Pave app, you’ll see a confidence scale – labeling each compensation benchmark from “Very High Confidence” to “Low Confidence” – as well as a sample size for each benchmark. Together, these guide users on the holistic confidence of the benchmark.
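As a rough illustration of how sample size and distribution shape might be folded into a single label, here is a toy heuristic in Python. The thresholds and the bootstrap approach are assumptions invented for this sketch; they are not Pave’s actual methodology.

```python
import numpy as np

def confidence_label(salaries, n_boot=2_000, seed=0):
    """Toy heuristic (not Pave's methodology): combine sample size with the
    relative width of a bootstrap CI around the median to pick a label."""
    rng = np.random.default_rng(seed)
    salaries = np.asarray(salaries, dtype=float)
    n = len(salaries)

    # Bootstrap 95% confidence interval for the median.
    boots = [np.median(rng.choice(salaries, size=n, replace=True)) for _ in range(n_boot)]
    lo, hi = np.percentile(boots, [2.5, 97.5])
    relative_width = (hi - lo) / np.median(salaries)  # CI width as a share of the median

    # Both conditions must hold: enough data points AND a tight enough interval.
    if n >= 100 and relative_width < 0.05:
        return "Very High Confidence"
    if n >= 30 and relative_width < 0.10:
        return "High Confidence"
    if n >= 10 and relative_width < 0.20:
        return "Medium Confidence"
    return "Low Confidence"

# Example: a tightly clustered sample of 250 salaries earns a strong label.
sample = np.random.default_rng(1).normal(150_000, 15_000, size=250)
print(confidence_label(sample))
```

The point of the sketch is the structure, not the numbers: a label like this downgrades a benchmark either because the sample is small or because the data is spread or clumped in a way that leaves the median poorly pinned down.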

Conclusion

The most accurate compensation benchmarks must encompass both sample size and data distribution. Ignoring either of these factors can lead to flawed decisions. By balancing these elements, compensation leaders can confidently establish fair and precise compensation benchmarks tailored to their unique organizational needs.

Katie Rovelstad
Operations Leader
Katie is an operations leader at Pave. Prior to joining Pave, Katie held various roles at Segment.

