ESQL: Add exponential histogram percentile function #137553

JonasKunz · 2025-11-04T08:27:37Z

Part of #137549.
Adds a new HistogramPercentile scalar function, which extracts a given percentile from a single exponential histogram value.

This function is not intended to be user-facing; it will only be used as a surrogate to implement the existing PERCENTILE aggregation in combination with an upcoming histogram-merge aggregation. For this reason, the function is undocumented (except for Javadoc) and not registered in EsqlFunctionRegistry.
We can eventually expose this function if needed, but in that case we need to go through the design discussions first.

Because this function is not user-facing, this PR does not add CSV tests yet.
CSV tests will follow when the exponential histogram PERCENTILE is implemented via the surrogate explained above.

JonasKunz · 2025-11-04T08:34:37Z

...a/org/elasticsearch/xpack/esql/expression/function/scalar/histogram/HistogramPercentile.java

+    @Evaluator(warnExceptions = ArithmeticException.class)
+    static void process(DoubleBlock.Builder resultBuilder, ExponentialHistogram value, double percentile) {
+        if (percentile < 0.0 || percentile > 100.0) {
+            throw new ArithmeticException("Percentile value must be in the range [0, 100], got: " + percentile);


Question: Is it a good or a bad practice to include the invalid value in the warning message? E.g. Asin doesn't include the wrong value in its warnings.

If the percentile is a constant once this is integrated, we should verify it up front and reject invalid values instead. But this should be okay for now.

IIUC, you mean that if someone writes a query like `STATS PERCENTILE(my_histo, 200)`, the query shouldn't even reach the execution phase; we should reject it with an error instead.

Am I understanding this correctly?
I unfortunately don't know what the correct callback / hook would be to perform this verification. I looked in the PERCENTILE aggregation and the POW function for examples, but they don't seem to be doing this kind of verification either.

If you could provide me with an example of how to do this, I can add it in a follow-up.
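
The two options discussed in this thread (a per-row warning versus rejecting a constant up front) can be contrasted in a small standalone sketch. It assumes, as I understand `warnExceptions`, that the listed exception is caught per row, recorded as a warning, and the row's result becomes null. The class `PercentileRangeCheck` and the methods `evaluateRow` and `verifyConstant` are hypothetical names for illustration; they are not Elasticsearch APIs and do not show where such a check would hook into the planner.

```java
import java.util.ArrayList;
import java.util.List;

public class PercentileRangeCheck {

    // Same range check as in the diff: percentiles outside [0, 100] are rejected.
    static void checkRange(double percentile) {
        if (percentile < 0.0 || percentile > 100.0) {
            throw new ArithmeticException("Percentile value must be in the range [0, 100], got: " + percentile);
        }
    }

    // Hypothetical stand-in for per-row evaluation: a failure turns into
    // a warning plus a null result, and the remaining rows are still processed.
    static Double evaluateRow(double percentile, List<String> warnings) {
        try {
            checkRange(percentile);
            return percentile; // placeholder for the real percentile computation
        } catch (ArithmeticException e) {
            warnings.add(e.getMessage());
            return null;
        }
    }

    // Hypothetical up-front check for a constant percentile: the whole query
    // fails before any row is evaluated.
    static void verifyConstant(double percentile) {
        if (percentile < 0.0 || percentile > 100.0) {
            throw new IllegalArgumentException("Percentile must be in the range [0, 100], got: " + percentile);
        }
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        System.out.println(evaluateRow(50.0, warnings));  // valid row
        System.out.println(evaluateRow(200.0, warnings)); // invalid row: null plus a warning
        System.out.println(warnings);
        try {
            verifyConstant(200.0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected before execution: " + e.getMessage());
        }
    }
}
```

With the per-row variant, a query with an invalid constant would still run and return nulls with one warning per affected row; the up-front variant surfaces the mistake once, as an error.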

elasticsearchmachine · 2025-11-04T11:19:24Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

dnhatn

I'm not sure if a separate function is required, but the implementation looks good. Let's merge it and move to the next step. Thanks, Jonas

dnhatn · 2025-11-06T01:33:07Z

...a/org/elasticsearch/xpack/esql/expression/function/scalar/histogram/HistogramPercentile.java

+    @Evaluator(warnExceptions = ArithmeticException.class)
+    static void process(DoubleBlock.Builder resultBuilder, ExponentialHistogram value, double percentile) {
+        if (percentile < 0.0 || percentile > 100.0) {
+            throw new ArithmeticException("Percentile value must be in the range [0, 100], got: " + percentile);


If the percentile is a constant once this is integrated, we should verify it up front and reject invalid values instead. But this should be okay for now.

.../elasticsearch/xpack/esql/expression/function/scalar/histogram/HistogramPercentileTests.java

JonasKunz · 2025-11-06T07:24:13Z

> I'm not sure if a separate function is required

Having this as a separate function helps on two fronts:

DRY: We will only need to implement one aggregation function for histograms: "merge". If we later add more analytical aggregations besides percentile (e.g. rank), they can all reuse the existing merge aggregation via surrogates instead of copying the logic over and over or doing fancy stuff with inheritance.
Performance: If merging and percentile computation were fused in the aggregation and a user queried for, e.g., 10 percentiles (p50, p75, p90, p95, ...), we'd do the merging, which is the actual expensive part, 10 times. If it is instead split into a merge aggregation plus a percentile extraction function, the logical planner will, to my understanding, reuse the merge result across the percentile computations, so we only have to do the expensive merge aggregation once instead of 10 times.
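
The performance argument can be illustrated with a toy model. This is only a sketch under stated assumptions: the histogram is a plain sorted map from bucket upper bound to count rather than the real `ExponentialHistogram`, the percentile estimate simply returns the upper bound of the bucket containing the target rank, and the names `MergeOnceSketch`, `merge`, `percentile`, `fused`, and `split` are hypothetical. It counts merge invocations to show the difference between a fused aggregation and a merge followed by several extractions.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MergeOnceSketch {

    // Counts how often the (expensive) merge step runs.
    static int mergeCalls = 0;

    // Toy merge: sums the counts of buckets that share an upper bound.
    static TreeMap<Double, Long> merge(List<Map<Double, Long>> histograms) {
        mergeCalls++;
        TreeMap<Double, Long> merged = new TreeMap<>();
        for (Map<Double, Long> h : histograms) {
            h.forEach((bound, count) -> merged.merge(bound, count, Long::sum));
        }
        return merged;
    }

    // Toy extraction: upper bound of the first bucket whose cumulative
    // count reaches the target rank; NaN for an empty histogram.
    static double percentile(TreeMap<Double, Long> histogram, double percentile) {
        long total = 0;
        for (long count : histogram.values()) {
            total += count;
        }
        double targetRank = percentile / 100.0 * total;
        long seen = 0;
        for (Map.Entry<Double, Long> bucket : histogram.entrySet()) {
            seen += bucket.getValue();
            if (seen >= targetRank) {
                return bucket.getKey();
            }
        }
        return Double.NaN;
    }

    // Fused variant: every requested percentile triggers its own merge.
    static double[] fused(List<Map<Double, Long>> inputs, double[] percentiles) {
        double[] out = new double[percentiles.length];
        for (int i = 0; i < percentiles.length; i++) {
            out[i] = percentile(merge(inputs), percentiles[i]);
        }
        return out;
    }

    // Split variant: merge once, then run the cheap extraction per percentile.
    static double[] split(List<Map<Double, Long>> inputs, double[] percentiles) {
        TreeMap<Double, Long> merged = merge(inputs);
        double[] out = new double[percentiles.length];
        for (int i = 0; i < percentiles.length; i++) {
            out[i] = percentile(merged, percentiles[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<Double, Long>> inputs = List.of(Map.of(1.0, 2L, 2.0, 2L), Map.of(2.0, 1L, 4.0, 5L));
        double[] percentiles = { 50, 75, 90, 95 };
        fused(inputs, percentiles);
        System.out.println("fused merges: " + mergeCalls);
        mergeCalls = 0;
        split(inputs, percentiles);
        System.out.println("split merges: " + mergeCalls);
    }
}
```

Both variants return the same values; only the number of merges differs, which is the cost the split design is meant to avoid.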

…-json

* upstream/main:
  Mute org.elasticsearch.xpack.inference.action.filter.ShardBulkInferenceActionFilterBasicLicenseIT testLicenseInvalidForInference {p0=false} elastic#137691
  Mute org.elasticsearch.xpack.inference.action.filter.ShardBulkInferenceActionFilterBasicLicenseIT testLicenseInvalidForInference {p0=true} elastic#137690
  [LTR] Fix feature display order when using explain. (elastic#137671)
  Remove extra RemoteClusterService instances in unit test (elastic#137647)
  Fix `ComponentTemplatesFileSettingsIT.testSettingsApplied` (elastic#137669)
  Consolidates troubleshooting content into the "Returning semantic field embeddings in _source" section (elastic#137233)
  Update bundled JDK to 25.0.1 (elastic#137640)
  resolve indices for prefixed _all expressions (elastic#137330)
  ESQL: Add TopN support for exponential histograms (elastic#137313)
  allows field caps to be cross project (elastic#137530)
  ESQL: Add exponential histogram percentile function (elastic#137553)
  Wait for nodes to have downloaded databases in `GeoIpDownloaderIT` (elastic#137636)
  Tighten on when THROTTLE decision can be returned (elastic#136794)
  Mute org.elasticsearch.xpack.esql.qa.single_node.GenerativeMetricsIT test elastic#137655
  Add a test for two little known conditional processor paths (elastic#137645)
  Extract a common ORIGIN constant (elastic#137612)
  Remove early phase failure in batched (elastic#136889)
  Returning correct index mode from get data streams api (elastic#137646)
  [ML] Manage AD results indices (elastic#136065)

JonasKunz added 2 commits November 4, 2025 08:42

Implement histogram percentile scalar function

cb171b9

Add unit tests

4cb99b3

elasticsearchmachine added the v9.3.0 and external-contributor labels Nov 4, 2025

JonasKunz added the :Analytics/ES|QL, v9.3.0, and >non-issue labels and removed the external-contributor and v9.3.0 labels Nov 4, 2025

JonasKunz added 2 commits November 4, 2025 09:32

Fix javadoc

c9a0c5c

spotless

1bf6d3a

JonasKunz commented Nov 4, 2025


JonasKunz marked this pull request as ready for review November 4, 2025 11:19

JonasKunz requested a review from dnhatn November 4, 2025 11:19

elasticsearchmachine added the Team:Analytics label Nov 4, 2025

dnhatn approved these changes Nov 6, 2025


JonasKunz added 2 commits November 6, 2025 08:32

Remove left-over override

44d3436

Merge branch 'main' into esql-percentile-function

6804bb6

JonasKunz enabled auto-merge (squash) November 6, 2025 07:35

JonasKunz merged commit 34e3417 into elastic:main Nov 6, 2025
34 checks passed

JonasKunz deleted the esql-percentile-function branch November 6, 2025 08:53

afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Nov 6, 2025

ESQL: Add exponential histogram percentile function (elastic#137553)

17fcc3f

Kubik42 pushed a commit to Kubik42/elasticsearch that referenced this pull request Nov 10, 2025

ESQL: Add exponential histogram percentile function (elastic#137553)

088c410
