HVAC service quality has historically been measured by callbacks and complaints - lagging indicators that arrive weeks after the problem occurred. measureQuick generates leading indicators on every job. You can see quality issues the same day they happen, before they become callbacks.
The companies that use this data consistently outperform those that do not. The difference is not the tool; it is the discipline of reviewing the numbers and acting on them.
What it measures: How consistently each technician uses measureQuick.
Why it matters: A technician who runs 3 tests in a week with 20 dispatched jobs is skipping measureQuick on 85% of their calls. The tool only works when it is used.
Target: Tests per week should closely match the technician's dispatched job count. A ratio below 80% needs investigation. Some job types (drain clearing, thermostat swap) may not warrant a full test, but any job involving a running system should include one.
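A minimal sketch of this check in Python, assuming you can pull weekly test counts and dispatched-job counts per technician; the names and numbers below are illustrative, not real export fields:

```python
# Flag technicians whose measureQuick usage ratio falls below 80%.
# Data here is made up for illustration; pull real counts from your
# dispatch board and the measureQuick Cloud Dashboard.

RATIO_THRESHOLD = 0.80

weekly_counts = {
    # technician: (tests_run, jobs_dispatched)
    "Alice": (18, 20),
    "Bob": (3, 20),
}

for tech, (tests, jobs) in weekly_counts.items():
    ratio = tests / jobs if jobs else 0.0
    if ratio < RATIO_THRESHOLD:
        print(f"{tech}: {ratio:.0%} usage ({tests}/{jobs}) - investigate")
```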
What it measures: How many physical instruments the technician connects per test.
Why it matters: Probe count directly determines diagnostic quality. A test with 3 probes captures temperature only. A test with 9+ probes captures temperature, pressure, airflow, and electrical, giving the full picture needed for a valid Vitals Score.
Target: 9 or more physical probes on cooling and heating tests. 7 or more on gas furnace tests. These thresholds match what measureQuick requires to calculate a Vitals Score.
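As a sketch, the threshold check might look like this; the test-type labels are assumptions for illustration, not measureQuick's actual identifiers:

```python
# Check whether a test meets the probe-count target for its type:
# 9+ probes on cooling and heating tests, 7+ on gas furnace tests.

PROBE_THRESHOLDS = {"cooling": 9, "heating": 9, "gas_furnace": 7}

def meets_probe_target(test_type: str, probe_count: int) -> bool:
    """Return True if the test has enough probes for a full diagnosis."""
    return probe_count >= PROBE_THRESHOLDS.get(test_type, 9)

print(meets_probe_target("cooling", 9))      # True
print(meets_probe_target("gas_furnace", 6))  # False
```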
*Figure: measureQuick Cloud Dashboard showing company statistics, project counts, and technician filter options*
What it measures: The percentage of tests where refrigerant charge passes.
Why it matters: Industry data from over 115,000 quality-filtered cooling tests shows a 56.0% charge failure rate on piston-metered systems. Your company's rate tells you how your work compares to the national baseline. A rate significantly worse than 56.0% may indicate installation or service quality issues. A rate significantly better suggests your technicians are doing above-average work, or may indicate testing methodology issues worth investigating.
Target: Track your own trend over time. If your charge failure rate is 60% and drops to 50% over six months, that is measurable improvement. Setting an absolute target below the industry average is reasonable for companies doing quality-focused work.
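A small trend-tracking sketch, assuming you record one company-wide charge failure rate per month; the monthly figures below are made up:

```python
# Compare the latest monthly charge failure rate to your own baseline
# and to the 56.0% industry figure for piston-metered systems.

INDUSTRY_PISTON_FAILURE_RATE = 0.560

monthly_failure_rates = [0.62, 0.60, 0.57, 0.55, 0.52, 0.50]  # 6 months

baseline, latest = monthly_failure_rates[0], monthly_failure_rates[-1]
print(f"Baseline: {baseline:.0%}, latest: {latest:.0%}, "
      f"change: {latest - baseline:+.0%}")
print(f"Versus industry (56.0%): {latest - INDUSTRY_PISTON_FAILURE_RATE:+.1%}")
```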
What it measures: The percentage of tests where duct system static pressure is within acceptable limits.
Why it matters: Over 70% of systems nationally exceed the 0.50-inch water column static pressure rating. High static pressure shortens equipment life and increases energy costs. Tracking your total external static pressure (TESP) pass rate shows whether your installations and duct modifications are improving airflow.
Target: Your company's rate should trend upward over time. If you are performing duct modifications or new installations, the TESP pass rate on those jobs should be meaningfully better than your maintenance-only calls.
What it measures: The average 0-100 system health score across all tests.
Why it matters: The Vitals Score is a composite of multiple subsystem results. A rising average means your work is producing measurably healthier systems. A declining average may indicate changing customer mix, seasonal effects, or quality drift.
Target: Track your own average and trend. A company Vitals Score average of 75+ on test-out results indicates consistently good work. Compare test-in averages (what you find) to test-out averages (what you leave behind) for the clearest picture.
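A sketch of the test-in versus test-out comparison, assuming you can export paired Vitals Scores per job; the scores below are illustrative:

```python
# Compare average test-in (what you find) and test-out (what you leave
# behind) Vitals Scores across paired tests.

from statistics import mean

paired_scores = [  # (test_in, test_out) Vitals Scores, 0-100
    (45, 82), (60, 75), (38, 70), (72, 74),
]

test_in_avg = mean(score_in for score_in, _ in paired_scores)
test_out_avg = mean(score_out for _, score_out in paired_scores)
print(f"Test-in avg: {test_in_avg:.1f}, test-out avg: {test_out_avg:.1f}")
```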
What it measures: The percentage of paired tests where the test-out Vitals Score is higher than the test-in.
Why it matters: This is the single most important metric for demonstrating that your work improves systems. A test-in score of 45 and a test-out score of 82 is concrete, measurable proof of value.
Target: 90%+ of repair jobs should show improvement from test-in to test-out. Jobs where the test-out is worse than test-in (it happens) should be investigated immediately.
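A minimal improvement-rate calculation over the same kind of paired (test-in, test-out) data, again with illustrative numbers:

```python
# Percentage of paired tests where the test-out Vitals Score beats the
# test-in score, checked against the 90% repair-job target. Any pair
# that got worse is called out for investigation.

TARGET = 0.90

pairs = [(45, 82), (60, 75), (38, 70), (72, 68)]  # (test_in, test_out)

improved = sum(1 for t_in, t_out in pairs if t_out > t_in)
rate = improved / len(pairs)
print(f"Improvement rate: {rate:.0%} (target {TARGET:.0%})")
for t_in, t_out in pairs:
    if t_out < t_in:
        print(f"Investigate: test-out {t_out} below test-in {t_in}")
```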
What it measures: The percentage of pass/fail results where the technician manually changed the app's determination.
Why it matters: Some overrides are legitimate. A technician may override a charge failure because the system just had refrigerant added and needs run time to stabilize. But a consistently high override rate across many tests suggests the technician is disagreeing with the diagnostics frequently, which may mean they need calibration training or that they are overriding failures to avoid uncomfortable customer conversations.
Target: Below 15% as a company average. Individual technicians consistently above 20% warrant a conversation. Ask them to explain their last 5 overrides; the answers will tell you whether the overrides are justified.
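A sketch of the override-rate flag, assuming per-technician override and total result counts; the numbers are illustrative:

```python
# Flag technicians whose override rate exceeds the 20% conversation
# threshold. Pull real counts from the dashboard.

OVERRIDE_THRESHOLD = 0.20

override_counts = {
    # technician: (overridden_results, total_pass_fail_results)
    "Alice": (4, 60),
    "Bob": (15, 50),
}

for tech, (overrides, total) in override_counts.items():
    rate = overrides / total
    if rate > OVERRIDE_THRESHOLD:
        print(f"{tech}: {rate:.0%} override rate - review last 5 overrides")
```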
Before setting targets, collect 30 days of data without announcing what you are measuring. This gives you an honest baseline. If your company's average probe count is 6.2 and your charge failure rate is 62%, those are your starting points.
measureQuick's aggregated data provides national benchmarks:
| Metric | Industry Benchmark |
|---|---|
| Charge failure rate (piston) | 56.0% |
| TESP failure rate | 70%+ exceed 0.50" |
| Venting failure rate | 29.6% |
| Heat pump market share (2025) | 47.0% |
These benchmarks give you context, not targets. Your goal is to improve relative to your own baseline and to understand where you stand relative to the industry.
Do not set targets that require a 50% improvement in 30 days. Set quarterly goals that move the needle by 5-10%. If your average probe count is 6.2, target 7.5 in Q1, 8.5 in Q2, and 9.0 in Q3. Incremental progress sticks. Dramatic targets demoralize teams.
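One way to generate such a ladder is a simple linear ramp, sketched below. Note that the example above front-loads the early quarters slightly, so treat this as an illustration rather than a formula:

```python
# Generate incremental quarterly targets from a baseline toward a goal,
# using an even linear ramp (6.2 -> 9.0 over three quarters).

def quarterly_targets(baseline: float, goal: float, quarters: int) -> list[float]:
    step = (goal - baseline) / quarters
    return [round(baseline + step * q, 1) for q in range(1, quarters + 1)]

print(quarterly_targets(6.2, 9.0, 3))  # [7.1, 8.1, 9.0]
```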
Block 30 minutes on your calendar each month. Pull the following from the dashboard for each technician:

- Tests run versus jobs dispatched (usage ratio)
- Average probe count per test
- Charge pass rate
- TESP pass rate
- Average Vitals Score (test-in and test-out)
- Test-in to test-out improvement rate
- Override rate
Present metrics to the team in a group setting with individual follow-ups as needed. The group review sets expectations and creates peer accountability. Individual follow-ups address specific issues without public embarrassment.
Group review agenda (15 minutes):

- Company-wide trend for each metric versus last month and versus your baseline
- Where the company stands against the industry benchmarks above
- Acknowledgment of measurable improvement

Individual follow-ups (as needed):

- Walk through the technician's specific numbers and ask what is driving them
- Review their last 5 overrides together
- Agree on one incremental target for the next month
*Figure: Cloud dashboard with company statistics, active users, projects, and equipment counts*
Metrics reveal specific skill gaps. Here is what to look for:
Low probe count combined with a high override rate: the technician is not connecting enough instruments and is overriding failures they cannot accurately assess. Training need: probe setup and connection procedures (B1, D3).
High test count but few test-in/test-out pairs: the technician runs tests but skips the before-and-after workflow. Training need: test-in/test-out workflow reinforcement (D5). This is often a time-pressure issue; the technician may need support in explaining the process to dispatch.
Charge failure rate well above the company average: either the technician is encountering worse systems (check their service area), or they are making measurement errors. Training need: review refrigerant measurement procedures (E4, E5) and verify probe calibration.
High override rate concentrated in one subsystem: if a technician overrides venting failures 80% of the time, they may not understand the measurement or may disagree with the threshold. Training need: targeted review of that subsystem's diagnostic criteria and thresholds.
Tests missing from the dashboard despite completed jobs: the technician is working offline and not syncing. This is not a skill issue; it is a process issue. Remind them to sync daily and verify cloud sync is enabled in settings (K1).
Any metric that is measured will eventually be gamed if the incentive structure encourages it. A technician told to "get your probe count up" may connect 9 probes without placing them correctly. A technician told to "reduce your override rate" may stop overriding legitimate exceptions.
Measure outcomes, not just compliance. The most important metric is test-in to test-out improvement, because it measures actual impact on the system. A technician cannot game an improvement in Vitals Score; it requires real measurement and real work.
Combine quantitative metrics with periodic qualitative review. Ride along with each technician quarterly to observe their actual process. Does the probe placement match what the data shows? Are overrides accompanied by notes explaining the reasoning?
Avoid ranking technicians publicly on metrics that create perverse incentives. Do not reward the technician with the highest pass rate (they may be overriding failures). Do not punish the technician with the lowest Vitals Score average (they may be servicing the oldest, worst-maintained systems). Context matters. Use metrics to start conversations, not to hand out awards.
At least 20 tests per technician per reporting period. Fewer than that and individual outlier tests skew the averages. For company-wide metrics, 100+ tests per month gives you reliable trends.
Segment your metrics by test type if possible. A technician who only does maintenance will have different numbers than one who does installations. Compare like to like.
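A sketch combining both rules: segment by technician and test type, and skip segments below the 20-test minimum. The records here are illustrative and deliberately tiny, so every segment gets skipped, demonstrating the filter:

```python
# Group test records into (technician, test_type) segments and only
# report averages for segments with at least 20 tests.

from collections import defaultdict

MIN_TESTS = 20

records = [  # (technician, test_type, vitals_score)
    ("Alice", "maintenance", 78), ("Alice", "maintenance", 81),
    ("Bob", "install", 90),
]

segments = defaultdict(list)
for tech, test_type, score in records:
    segments[(tech, test_type)].append(score)

for (tech, test_type), scores in segments.items():
    if len(scores) < MIN_TESTS:
        print(f"Skipping {tech}/{test_type}: only {len(scores)} tests")
        continue
    print(f"{tech}/{test_type}: avg {sum(scores) / len(scores):.1f}")
```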
Start with a spreadsheet. Pull numbers from the dashboard monthly and enter them manually. It takes 15 minutes per month and is sufficient for companies with fewer than 20 technicians. If you outgrow the spreadsheet, consider exporting data for analysis in a dedicated tool.
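If you want to script that monthly pull, a minimal CSV tracker might look like this; every column name and value here is hypothetical:

```python
# Append one row per technician per month to a CSV tracker - a minimal
# stand-in for the manual spreadsheet described above.

import csv
from pathlib import Path

TRACKER = Path("mq_metrics.csv")
FIELDS = ["month", "technician", "tests", "avg_probes", "charge_pass_rate",
          "tesp_pass_rate", "vitals_avg", "improvement_rate", "override_rate"]

row = {"month": "2025-01", "technician": "Alice", "tests": 42,
       "avg_probes": 8.7, "charge_pass_rate": 0.48, "tesp_pass_rate": 0.35,
       "vitals_avg": 74.2, "improvement_rate": 0.93, "override_rate": 0.11}

new_file = not TRACKER.exists()
with TRACKER.open("a", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    if new_file:
        writer.writeheader()
    writer.writerow(row)
```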
Metrics measure what the app can see, not everything that matters. Professionalism, communication, cleanliness, punctuality - these are real quality indicators that do not show up in the dashboard. Use measureQuick metrics alongside, not instead of, customer feedback.