Can Vision-Language Models Reliably Extract Equipment Tags from P&IDs?

Author: Rosary Labs Team

Date: 4th January 2026

Summary

We experimented with five state-of-the-art Vision Language Models to extract equipment tags from Piping and Instrumentation Diagrams (P&IDs). Our findings revealed varying levels of accuracy, hallucination tendencies, and practical limitations when using VLMs for equipment tag extraction purposes.

How did it all start?

We were curious about the general ability of Vision Language Models (VLMs) to extract equipment tags from Piping and Instrumentation Diagrams (P&IDs), and specifically about their accuracy. So we tested a handful of VLMs: GPT-5.2, Claude Opus 4.5, Gemini 3 Pro Preview, GLM-4.6V and Qwen3 VL 235B.

Experiment Setup

To ensure a reliable comparison, we built a ground-truth dataset by manual extraction and used it as the benchmark for measuring the accuracy of the VLM-generated results.

Here's the P&ID diagram used:

Ground Truth Dataset

Manually extracted by an instrumentation engineer from the P&ID diagram:

Sensors (57)

TAH10X, XA10X, XR10X, HS10X, ZI10X, LIR100, LAL100, LAH100, PIR100, LAL101, LAH102, PIR101, FIR100, PAL101, PAH101, ZA104, FI100, TI10X, XA10X, XR10X, HS10X, ZI10X, LI100, PIT100, LI101, LI102, PIT101, FIT100, ZI104, HS10X, YLAR, YLG, LCP100, OIT10X, PIT101, FIT100, PI101, LIT100, LE100, LS101, LS102, ZS104, PIT100, FIR101, FIA101, FIT101, PAH200A, PAL200B, PIR200C, PI200, PIT200, PDIT201, PDI201, PDIR201B, PDAH201A, PI101

Valves (32)

V-101, V-102, V-103, V-104, V-105, V-106, V-107, V-108, V-109, V-110, V-111, V-112, V-113, V-114, V-117, V-201, V-202, V-203, V-204, V-205, V-206, V-208, V-209, V-210, V-212, V-214, V-216, V-218, V-219, V-220, V-221, V-222

Pumps

P-101, P-102, P-103

Motors

M-101, M-102, M-103

Filters

F-201, F-202, F-203

Time to experiment with Vision Language Models

We tested five Vision Language Models, each among the most advanced available from its provider (OpenAI, Anthropic, Google, Alibaba and Z.ai):

  1. Claude Opus 4.5
  2. Gemini 3 Pro Preview
  3. OpenAI GPT-5.2
  4. Qwen3 VL 235B A22B Instruct
  5. Z.ai GLM-4.6V

And the prompt?

Extract all the equipment tags from this attached pipeline and instrumentation diagram, be as thorough as you can as if your life depends on it. Extract it as an instrumentation engineers, for the purpose of asset registry. Extract unique engineering tag identifiers only. Format your output as a block of newline separated text. Go!

Our initial prompt revealed unexpected VLM behaviors that honestly caught us by surprise!

Vision Language Models Output

Claude Opus 4.5

Equipment Tag Extraction - PID-102 (MONK HILL TREATMENT SYSTEM - GAC TANKS)

Based on thorough review of the P&ID, here are all unique equipment tags identified:

  • P-101
  • P-102
  • P-103
  • M-101
  • M-102
  • M-103
  • F-201
  • F-202
  • F-203
  • V-219
  • V-220
  • V-221
  • V-222
  • V-223
  • V-224
  • V-225
  • V-226
  • V-101
  • V-102
  • V-103
  • V-104
  • V-105
  • V-106
  • V-107
  • V-108
  • V-109
  • V-110
  • V-111
  • V-112
  • V-113
  • V-114
  • V-115
  • V-116
  • V-117
  • PI-101
  • PI-102
  • PI-103
  • PI-104
  • FI-101
  • FI-102
  • FI-103
  • FI-104
  • LI-101
  • LI-102
  • LI-103
  • LS-101
  • LS-102
  • LS-103
  • LE-105
  • LCP 100
  • OIT 10K
  • FE-101
  • FE-102
  • FE-103
  • FE-104
  • FE-105
  • PSH-201
  • PSH-202
  • PSH-203
  • GAC-201
  • GAC-202
  • GAC-203
  • CV-101
  • CV-102
  • CV-103
  • CV-104
  • CV-105
  • CV-106
  • CV-107
  • CV-108
  • YS-100
  • GB-101

Notes:

  • Pumps: P-101, P-102, P-103 (Booster Pumps - 3,500 GPM @ 300' TDH each)
  • Motors: M-101, M-102, M-103 (350 HP each)
  • Filters: F-201, F-202, F-203 (Bag Filters - Calgon Carbon furnished, 10 micron)
  • Valves: V-series (manual and automated isolation/control valves)
  • Instrumentation: PI (Pressure Indicators), FI/FE (Flow), LI/LS/LE (Level), PSH (Pressure Switch High)
  • Control: LCP 100, OIT 10K, YS-100

Gemini 3 Pro Preview

Based on a detailed inspection of the P&ID (PID-102), here are the extracted unique engineering equipment tags for the asset registry.


Booster Pumps & Motors:

  • P-101
  • M-101
  • P-102
  • M-102
  • P-103
  • M-103

Bag Filter Vessels:

  • F-201
  • F-202
  • F-203

Valves (Manual, Check, Automatic):

  • V-101
  • V-102
  • V-103
  • V-104
  • V-105
  • V-106
  • V-107
  • V-108
  • V-109
  • V-110
  • V-111
  • V-113
  • V-117
  • V-200
  • V-201
  • V-202
  • V-203
  • V-204
  • V-205
  • V-206
  • V-216
  • V-217
  • V-218
  • V-219
  • V-220
  • V-221
  • V-222

Instrumentation (Sensors, Indicators, Transmitters):

  • FI-100
  • LIT-100
  • LE-100
  • LS-100
  • PI-101
  • FIT-101
  • FIR-101
  • FAL-101
  • FAH-101
  • LS-101
  • PIT-101
  • PI-101
  • PIT-102
  • PI-102
  • LS-102
  • ZS-104
  • LAL-100
  • LAH-100
  • PIR-100
  • PIR-101
  • PAL-100
  • PAH-100
  • TA-10X
  • FA-10X
  • TAH-10X
  • FI-10X
  • YA-10X
  • HS-10X
  • ZI-10X
  • YL-10X
  • LCP-100
  • PIT-200
  • PIR-200
  • PAD-200
  • PAL-200
  • PDT-201
  • PI-101
  • PDT-201
  • PDAH-201A
  • PDR-201B
  • OIT-10X
  • FIT-101
  • FR-101

GPT-5.2

I need to stop you here to avoid doing something actively harmful for an asset registry.

The image you attached is a low-resolution raster preview, not the original PDF. At this resolution, most P&ID tag text is not legible enough to be read with engineering certainty. Extracting tags from this image would require guessing, and guessing tag IDs is unacceptable for instrumentation, safety, maintenance, or asset management purposes.

If I proceeded anyway, I would almost certainly hallucinate tags that do not exist or misread critical identifiers, which would corrupt your asset register.


What I need to do this correctly

Please upload one of the following:

  • The original vector PDF (preferred)
  • Or a high-resolution scan (minimum ~300 DPI where tag text is clearly readable)
  • Or cropped zoomed-in images of each P&ID area where tags are readable

Once I have that, I will:

  • OCR every readable tag
  • Normalize tag formatting
  • Deduplicate strictly
  • Extract only true engineering tag identifiers (no notes, no descriptions)
  • Output exactly what you requested: a block of newline-separated unique tags, suitable for direct ingestion into an asset registry

Why this matters

For P&IDs:

  • A single character error (e.g. PT-101 vs PT-110) is a different physical asset
  • Instrumentation engineers rely on absolute fidelity
  • "Best guess" extraction is worse than no extraction

Upload the original PDF or higher‑resolution images and say "Proceed" — I'll take it from there and be as thorough and ruthless as you want.

Qwen3 VL 235B A22B Instruct

  • P-101
  • P-102
  • P-103
  • M-101
  • M-102
  • M-103
  • V-106
  • V-107
  • V-108
  • V-109
  • V-110
  • V-111
  • V-113
  • V-117
  • V-201
  • V-202
  • V-203
  • V-204
  • V-205
  • V-206
  • V-207
  • V-208
  • V-209
  • V-210
  • V-211
  • V-214
  • V-218
  • V-219
  • V-220
  • V-221
  • V-222
  • F-201
  • F-202
  • F-203
  • LS-101
  • LS-102
  • LE-100
  • FIT-100
  • FIT-101
  • FIR-101
  • PI-101
  • PIT-101
  • PDIT-201
  • PDAH-201A
  • PDIR-201B
  • PAH-200A
  • PAL-200B
  • PIR-200C
  • ZI-101
  • ZI-102
  • ZI-103
  • ZI-104
  • UR-100
  • LAL-101
  • LAH-102
  • LIR-101
  • LAR-101
  • LAR-102
  • FAL-100
  • TAH-10X
  • KA-10X
  • KB-10X
  • HS-10X
  • HS-10X
  • YL-10X
  • YLA-R
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-10X
  • HS-1

The list goes on and on…

Z.ai GLM-4.6V

I need to extract all the equipment tags from the provided P&ID (Pipeline and Instrumentation Diagram). I'll analyze the diagram carefully to identify all the unique engineering tag identifiers.

Looking at the P&ID, I can see various equipment tags throughout the diagram. Let me go through it systematically:

  1. Pumps:
    • P-101
    • P-102
    • P-103
  2. Filters:
    • F-201
    • F-202
    • F-203
  3. Valves:
    • V-219
    • V-220
    • V-201
    • V-202
    • V-203
    • V-204
    • V-205
    • V-206
    • V-210
    • V-211
    • V-212
    • V-213
    • V-214
    • V-215
    • V-216
    • V-217
    • V-218
    • V-221
    • V-222
    • V-223
  4. Pressure Indicators:
    • PI-101
    • PI-102
    • PI-103
    • PI-104
    • PI-105
    • PI-106
    • PI-107
    • PI-108
    • PI-109
    • PI-110
    • PI-111
    • PI-112
    • PI-113
    • PI-114
    • PI-115
    • PI-116
    • PI-117
    • PI-118
    • PI-119
    • PI-120
    • PI-121
    • PI-122
    • PI-123
    • PI-124
    • PI-125
    • PI-126
    • PI-127
    • PI-128
    • PI-129
    • PI-130
    • PI-131
    • PI-132
    • PI-133
    • PI-134
    • PI-135
    • PI-136
    • PI-137
    • PI-138
    • PI-139
    • PI-140
    • PI-141
    • PI-142
    • PI-143
    • PI-144
    • PI-145
    • PI-146
    • PI-147
    • PI-148
    • PI-149
    • PI-150
    • PI-151
    • PI-152
    • PI-153
    • PI-154
    • PI-155
    • PI-156
    • PI-157
    • PI-158
    • PI-159
    • PI-160
    • PI-161
    • PI-162
    • PI-163
    • PI-164
    • PI-165
    • PI-166
    • PI-167
    • PI-168
    • PI-169
    • PI-170
    • PI-171
    • PI-172
    • PI-173
    • PI-174
    • PI-175
    • PI-176
    • PI-177
    • PI-178
    • PI-179
    • PI-180
    • PI-181
    • PI-182
    • PI-183
    • PI-184
    • PI-185
    • PI-186
    • PI-187
    • PI-188
    • PI-189
    • PI-190
    • PI-191
    • PI-192
    • PI-193
    • PI-194
    • PI-195
    • PI-196
    • PI-197
    • PI-198
    • PI-199
    • PI-200
    • PI-201
    • PI-202
    • PI-203
    • PI-204
    • PI-205
    • PI-206
    • PI-207
    • PI-208
    • PI-209
    • PI-210
    • PI-211
    • PI-212
    • PI-213
    • PI-214
    • PI-215
    • PI-216
    • PI-217
    • PI-218
    • PI-219
    • PI-220
    • PI-221
    • PI-222
    • PI-223
    • PI-224
    • PI-225
    • PI-226
    • PI-227
    • PI-228
    • PI-229
    • PI-230
    • PI-231
    • PI-232
    • PI-233
    • PI-234
    • PI-235
    • PI-236
    • PI-237
    • PI-238
    • PI-239
    • PI-240
    • PI-241
    • PI-242
    • PI-243
    • PI-244
    • PI-245
    • PI-246
    • PI-247
    • PI-248
    • PI-249
    • PI-250
    • PI-251
    • PI-252
    • PI-253
    • PI-254
    • PI-255
    • PI-256
    • PI-257
    • PI-258
    • PI-259
    • PI-260
    • PI-261
    • PI-262
    • PI-263
    • PI-264
    • PI-265
    • PI-266
    • PI-267
    • PI-268
    • PI-269
    • PI-270
    • PI-271
    • PI-272
    • PI-273
    • PI-274
    • PI-275
    • PI-276
    • PI-277
    • PI-278
    • PI-279
    • PI-280
    • PI-281
    • PI-282
    • PI-283
    • PI-284
    • PI-285
    • PI-286
    • PI-287
    • PI-288
    • PI-289
    • PI-290
    • PI-291
    • PI-292
    • PI-293
    • PI-294
    • PI-295
    • PI-296
    • PI-297
    • PI-298
    • PI-299
    • PI-300
    • PI-301
    • PI-302
    • PI-303
    • PI-304
    • PI-305
    • PI-306
    • PI-307
    • PI-308
    • PI-309
    • PI-310
    • PI-311
    • PI-312
    • PI-313
    • PI-314
    • PI-315
    • PI-316
    • PI-317
    • PI-318
    • PI-319
    • PI-320
    • PI-321
    • PI-322
    • PI-323
    • PI-324
    • PI-325
    • PI-326
    • PI-327
    • PI-328
    • PI-329
    • PI-330
    • PI-331
    • PI-332
    • PI-333
    • PI-334
    • PI-335
    • PI-336
    • PI-337
    • PI-338
    • PI-339
    • PI-340
    • PI-341
    • PI-342
    • PI-343
    • PI-344
    • PI-345
    • PI-346
    • PI-347
    • PI-348
    • PI-349
    • PI-350
    • PI-351
    • PI-352
    • PI-353
    • PI-354
    • PI-355
    • PI-356
    • PI-357
    • PI-358
    • PI-359
    • PI-360
    • PI-361
    • PI-362
    • PI-363
    • PI-364
    • PI-365
    • PI-366
    • PI-367
    • PI-368
    • PI-369
    • PI-370
    • PI-371
    • PI-372
    • PI-373
    • PI-374
    • PI-375
    • PI-376
    • PI-377
    • PI-378
    • PI-379
    • PI-380
    • PI-381
    • PI-382
    • PI-383
    • PI-384
    • PI-385
    • PI-386
    • PI-387
    • PI-388
    • PI-389
    • PI-390
    • PI-391
    • PI-392
    • PI-393
    • PI-394
    • PI-395
    • PI-396
    • PI-397
    • PI-398
    • PI-399
    • PI-400
    • PI-401
    • PI-402
    • PI-403
    • PI-404
    • PI-405
    • PI-406
    • PI-407
    • PI-408
    • PI-409
    • PI-410
    • PI-411
    • PI-412
    • PI-413
    • PI-414
    • PI-415
    • PI-416
    • PI-417
    • PI-418
    • PI-419
    • PI-420
    • PI-421
    • PI-422
    • PI-423
    • PI-424
    • PI-425
    • PI-426
    • PI-427
    • PI-428
    • PI-429
    • PI-430
    • PI-431
    • PI-432
    • PI-433
    • PI-434
    • PI-435
    • PI-436
    • PI-437
    • PI-438
    • PI-439
    • PI-440
    • PI-441
    • PI-442
    • PI-443
    • PI-444
    • PI-445
    • PI-446
    • PI-447
    • PI-448
    • PI-449
    • PI-450
    • PI-451
    • PI-452
    • PI-453
    • PI-454
    • PI-455
    • PI-456
    • PI-457
    • PI-458
    • PI-459
    • PI-460
    • PI-461
    • PI-462
    • PI-463
    • PI-464
    • PI-465
    • PI-466
    • PI-467
    • PI-468
    • PI-469
    • PI-470
    • PI-471
    • PI-472
    • PI-473
    • PI-474
    • PI-475
    • PI-476
    • PI-477
    • PI-478
    • PI-479
    • PI-480
    • PI-481
    • PI-482
    • PI-483
    • PI-484
    • PI-485
    • PI-486
    • PI-487
    • PI-488
    • PI-489
    • PI-490
    • PI-491
    • PI-492
    • PI-493
    • PI-494
    • PI-495
    • PI-496
    • PI-497
    • PI-498
    • PI-499
    • PI-500
    • PI-501
    • PI-502
    • PI-503
    • PI-504
    • PI-505
    • PI-506
    • PI-507
    • PI-508
    • PI-509
    • PI-510
    • PI-511
    • PI-512
    • PI-513
    • PI-514
    • PI-515
    • PI-516
    • PI-517
    • PI-518
    • PI-519
    • PI-520
    • PI-521
    • PI-522
    • PI-523
    • PI-524
    • PI-525
    • PI-526
    • PI-527
    • PI-528
    • PI-529
    • PI-530
    • PI-531
    • PI-532
    • PI-533
    • PI-534
    • PI-535
    • PI-536
    • PI-537
    • PI-538
    • PI-539
    • PI-540
    • PI-541
    • PI-542
    • PI-543
    • PI-544
    • PI-545
    • PI-546
    • PI-547
    • PI-548
    • PI-549
    • PI-550
    • PI-551
    • PI-552
    • PI-553
    • PI-554
    • PI-555
    • PI-556
    • PI-557
    • PI-558
    • PI-559
    • PI-560
    • PI-561
    • PI-562
    • PI-563
    • PI-564
    • PI-565
    • PI-566
    • PI-567
    • PI-568
    • PI-569
    • PI-570
    • PI-571
    • PI-572
    • PI-573
    • PI-574
    • PI-575
    • PI-576
    • PI-577
    • PI-578
    • PI-579
    • PI-580
    • PI-581
    • PI-582
    • PI-583
    • PI-584
    • PI-585
    • PI-586
    • PI-587
    • PI-588
    • PI-589
    • PI-590
    • PI-591
    • PI-592
    • PI-593
    • PI-594
    • PI-595
    • PI-596
    • PI-597
    • PI-598
    • PI-599
    • PI-600
    • PI-601
    • PI-602
    • PI-603
    • PI-604
    • PI-605
    • PI-606
    • PI-607
    • PI-608
    • PI-609
    • PI-610
    • PI-611
    • PI-612
    • PI-613
    • PI-614
    • PI-615
    • PI-616
    • PI-617
    • PI-618
    • PI-619
    • PI-620
    • PI-621
    • PI-622
    • PI-623
    • PI-624
    • PI-625
    • PI-626
    • PI-627
    • PI-628
    • PI-629
    • PI-630
    • PI-631
    • PI-632
    • PI-633
    • PI-634
    • PI-635
    • PI-636
    • PI-637
    • PI-638
    • PI-639
    • PI-640
    • PI-641
    • PI-642
    • PI-643
    • PI-644
    • PI-645
    • PI-646
    • PI-647
    • PI-648
    • PI-649
    • PI-650
    • PI-651
    • PI-652
    • PI-653
    • PI-654
    • PI-655
    • PI-656
    • PI-657
    • PI-658
    • PI-659
    • PI-660
    • PI-661
    • PI-662
    • PI-663
    • PI-664
    • PI-665
    • PI-666
    • PI-667
    • PI-668
    • PI-669
    • PI-670
    • PI-671
    • PI-672
    • PI-673
    • PI-674
    • PI-675
    • PI-676
    • PI-677
    • PI-678
    • PI-679
    • PI-680
    • PI-681
    • PI-682
    • PI-683
    • PI-684
    • PI-685
    • PI-686
    • PI-687
    • PI-688
    • PI-689
    • PI-690
    • PI-691
    • PI-692
    • PI-693
    • PI-694
    • PI-695
    • PI-696
    • PI-697
    • PI-698
    • PI-699
    • PI-700
    • PI-701
    • PI-702
    • PI-703
    • PI-704
    • PI-705
    • PI-706
    • PI-707
    • PI-708
    • PI-709
    • PI-710
    • PI-711
    • PI-712
    • PI-713
    • PI-714
    • PI-715
    • PI-716
    • PI-717
    • PI-718
    • PI-719
    • PI-720
    • PI-721
    • PI-722
    • PI-723
    • PI-724
    • PI-725
    • PI-726
    • PI-727
    • PI-728
    • PI-729
    • PI-730
    • PI-731
    • PI-732
    • PI-733
    • PI-734
    • PI-735
    • PI-736
    • PI-737
    • PI-738
    • PI-739
    • PI-740
    • PI-741
    • PI-742
    • PI-743
    • PI-744
    • PI-745
    • PI-746
    • PI-747
    • PI-748
    • PI-749
    • PI-750
    • PI-751
    • PI-752
    • PI-753
    • PI-754
    • PI-755
    • PI-756
    • PI-757
    • PI-758
    • PI-759
    • PI-760
    • PI-761
    • PI-762
    • PI-763
    • PI-764
    • PI-765
    • PI-766
    • PI-767
    • PI-768
    • PI-769
    • PI-770
    • PI-771
    • PI-772
    • PI-773
    • PI-774
    • PI-775
    • PI-776
    • PI-777
    • PI-778
    • PI-779
    • PI-780
    • PI-781
    • PI-782
    • PI-783
    • PI-784
    • PI-785
    • PI-786
    • PI-787
    • PI-788
    • PI-789
    • PI-790
    • PI-791
    • PI-792
    • PI-793
    • PI-794
    • PI-795
    • PI-796
    • PI-797
    • PI-798
    • PI-799
    • PI-800
    • PI-801
    • PI-802
    • PI-803
    • PI-804
    • PI-805
    • PI-806
    • PI-807
    • PI-808
    • PI-809
    • PI-810
    • PI-811
    • PI-812
    • PI-813
    • PI-814
    • PI-815
    • PI-816
    • PI-817
    • PI-818
    • PI-819
    • PI-820
    • PI-821
    • PI-822
    • PI-823
    • PI-824
    • PI-825
    • PI-826
    • PI-827
    • PI-828
    • PI-829
    • PI-830
    • PI-831
    • PI-832
    • PI-833
    • PI-834
    • PI-835
    • PI-836
    • PI-837
    • PI-838
    • PI-839
    • PI-840
    • PI-841
    • PI-842
    • PI-843
    • PI-844
    • PI-845
    • PI-846
    • PI-847
    • PI-848
    • PI-849
    • PI-850
    • PI-851
    • PI-852
    • PI-853
    • PI-854
    • PI-855
    • PI-856
    • PI-857
    • PI-858
    • PI-859
    • PI-860
    • PI-861
    • PI-862
    • PI-863
    • PI-864
    • PI-865
    • PI-866
    • PI-867
    • PI-868
    • PI-869
    • PI-870
    • PI-871
    • PI-872
    • PI-873
    • PI-874
    • PI-875
    • PI-876
    • PI-877
    • PI-878
    • PI-879
    • PI-880
    • PI-881
    • PI-882
    • PI-883
    • PI-884
    • PI-885
    • PI-886
    • PI-887
    • PI-888
    • PI-889
    • PI-890
    • PI-891
    • PI-892
    • PI-893
    • PI-894
    • PI-895
    • PI-896
    • PI-897
    • PI-898
    • PI-899
    • PI-900
    • PI-901
    • PI-902
    • PI-903
    • PI-904
    • PI-905
    • PI-906
    • PI-907
    • PI-908
    • PI-909
    • PI-910
    • PI-911
    • PI-912
    • PI-913
    • PI-914
    • PI-915
    • PI-916
    • PI-917
    • PI-918
    • PI-919
    • PI-920
    • PI-921
    • PI-922
    • PI-923
    • PI-924
    • PI-925
    • PI-926
    • PI

Evaluation of VLMs Results

Quantitative Evaluation

We manually extracted all 100 equipment tags from the P&ID above and used a simple confusion matrix to evaluate each Vision Language Model. We excluded GPT-5.2 and GLM-4.6V from the scoring: GPT-5.2 refused to perform the extraction on the image supplied, and GLM-4.6V degenerated into enumerating hallucinated PI tags.
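The post does not describe how extracted tags were matched against the ground truth, so the following is only a minimal sketch of one plausible way to do it. It assumes tags are compared as sets after a simple normalization (upper-casing and dropping spaces and hyphens), because the ground truth writes sensors as `PI101` while the models return `PI-101`. The function names and the sample lists are hypothetical, and set-based counting collapses duplicate tags, which may differ from how the figures in this post were counted.

```python
import re


def normalize_tag(tag: str) -> str:
    """Upper-case a tag and drop spaces and hyphens, so 'PI-101' matches 'PI101'."""
    return re.sub(r"[\s\-]+", "", tag.upper())


def confusion_counts(ground_truth, predicted):
    """Count true positives, false positives and false negatives over unique tags."""
    truth = {normalize_tag(t) for t in ground_truth}
    pred = {normalize_tag(t) for t in predicted}
    return {
        "tp": len(truth & pred),  # tags the model found that really exist
        "fp": len(pred - truth),  # tags the model reported that are not in the ground truth
        "fn": len(truth - pred),  # ground-truth tags the model missed
    }


# Tiny illustrative sample, not the full dataset from this post.
truth_sample = ["PI101", "LS101", "V-101", "P-101"]
model_sample = ["PI-101", "PI-102", "V-101", "P-101", "LCP 100"]

counts = confusion_counts(truth_sample, model_sample)
print(counts)
```

Note that there is no true-negative cell here: the set of tags a model could wrongly invent is unbounded, so only these three counts are meaningful for extraction tasks.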

Based on this simple confusion matrix, here are the tabulated results:

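| Metric         | Claude Opus 4.5 | Gemini 3 Pro Preview | GPT-5.2 | Qwen3 VL 235B A22B Instruct | Z.ai GLM 4.6V |
|----------------|-----------------|----------------------|---------|-----------------------------|---------------|
| True Positive  | 28              | 34                   | -       | 32                          | -             |
| False Positive | 44              | 45                   | -       | 49                          | -             |
| False Negative | 26              | 19                   | -       | 17                          | -             |
| Precision      | 0.39            | 0.43                 | -       | 0.40                        | -             |
| Recall         | 0.52            | 0.64                 | -       | 0.65                        | -             |
| F1 Score       | 0.44            | 0.52                 | -       | 0.49                        | -             |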

From the results above, Gemini 3 Pro Preview achieves the highest F1 score, with Qwen3 close behind and Claude Opus 4.5 third.
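For readers who want to check the arithmetic, here is a short sketch that derives precision, recall and F1 from the true positive, false positive and false negative counts reported in the table above. It uses the standard definitions; the function and variable names are our own for illustration.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard definitions; returns 0.0 where a denominator would be zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# (TP, FP, FN) as reported in the results table.
reported = {
    "Claude Opus 4.5": (28, 44, 26),
    "Gemini 3 Pro Preview": (34, 45, 19),
    "Qwen3 VL 235B A22B Instruct": (32, 49, 17),
}

for model, (tp, fp, fn) in reported.items():
    p, r, f1 = precision_recall_f1(tp, fp, fn)
    print(f"{model}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Rounded to two decimals, these reproduce the precision, recall and F1 rows of the table.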

As the results show, general-purpose VLMs (at least the ones we tested) cannot yet achieve high accuracy on this task: none exceeded a precision of 0.43 or an F1 score of 0.52.

Qualitative Evaluation

The tags that Qwen3 and Z.ai GLM extracted at the start of their responses were accurate. Unfortunately, Qwen3 then fell into a repetitive loop at HS-10X, despite posting the highest recall score in the table above. This immediately rules it out for producing consistent results.
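Both failure modes seen in the outputs (one tag repeated over and over, and a prefix counted upwards far beyond anything on the drawing) are easy to flag automatically. The sketch below is one possible check, not something we ran in this experiment; the function names and both thresholds are assumptions chosen purely for illustration.

```python
import re
from itertools import groupby


def longest_repeat_run(tags):
    """Length of the longest run of identical consecutive tags."""
    return max((len(list(group)) for _, group in groupby(tags)), default=0)


def longest_counting_run(tags):
    """Length of the longest run where the same prefix counts up by one each line."""
    best = current = 0
    prev = None
    for tag in tags:
        match = re.fullmatch(r"([A-Z]+)-?(\d+)", tag.strip().upper())
        key = (match.group(1), int(match.group(2))) if match else None
        if key and prev and key[0] == prev[0] and key[1] == prev[1] + 1:
            current += 1
        else:
            current = 1 if key else 0
        best = max(best, current)
        prev = key
    return best


def looks_degenerate(tags, max_repeat=3, max_count=20):
    """Flag an output as suspicious. Both thresholds are arbitrary assumptions."""
    return (longest_repeat_run(tags) > max_repeat
            or longest_counting_run(tags) > max_count)


qwen_like = ["YL-10X"] + ["HS-10X"] * 30             # repetition loop
glm_like = [f"PI-{n}" for n in range(101, 927)]      # runaway enumeration
normal = ["V-101", "V-102", "V-103", "PI-101", "LS-101"]

print(looks_degenerate(qwen_like), looks_degenerate(glm_like), looks_degenerate(normal))
```

The counting threshold needs care: legitimate drawings do contain consecutive runs (the ground truth here has V-101 through V-114), so a limit that is too low would flag valid output.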

That leaves Claude Opus 4.5 and Gemini 3 Pro. Of the two, Gemini 3 Pro extracted more equipment tags correctly, even though Claude Opus 4.5 is the top-tier model in the Claude series.

Additionally, Gemini 3 Pro grouped the tags by equipment type: pumps and motors, filters, valves and instrumentation. This suggests that modern VLMs have a foundational understanding of P&ID tagging conventions.

Disclaimer: Claude Opus 4.5 may simply need a different prompt before it sorts the equipment tags into categories. Here, though, Gemini 3 Pro classified the tags without being explicitly instructed to. Good job, Gemini 3 Pro!

However, we only evaluated the models using a simple confusion matrix, so there are many evaluation metrics we did not account for. Two of the most important are response speed and cost. We are also looking into equally important qualitative criteria, such as consistency across repeated runs and categorisation of information for readability. Stay tuned to learn more about how we evaluate the performance of an AI system in future posts!

What to take away?

So far, VLMs have exceeded our expectations in how readily they extract P&ID equipment tags. The traditional approach would have combined text object detection with OCR, which is far more involved than uploading a P&ID to a chatbot and getting roughly 40% of the returned tags right.

Nevertheless, current VLMs lack the accuracy needed for production-grade P&ID analysis. Closing this gap requires pre-processing and post-processing steps that combine VLMs with machine learning techniques from before the Language Model era.
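As one example of what a pre-processing step might look like, the sketch below splits a large drawing into overlapping tiles so that each crop can be sent to the model at a legible resolution, in the spirit of the "cropped zoomed-in images" GPT-5.2 asked for. This is an assumption about a possible approach, not a pipeline we have evaluated; the function name, tile size and overlap are all hypothetical, and the code only computes crop coordinates rather than touching any image library.

```python
def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 128):
    """Return (left, top, right, bottom) crop boxes that cover the whole image.

    Neighbouring tiles share `overlap` pixels so that a tag sitting on a tile
    boundary appears whole in at least one crop.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap

    def starts(length):
        # A single tile is enough when the image is smaller than the tile.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, step))
        positions.append(length - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


# Hypothetical 2000 x 1000 pixel scan of a drawing.
boxes = tile_boxes(2000, 1000)
for box in boxes:
    print(box)
```

Tags extracted from overlapping tiles would then need to be merged and deduplicated in a post-processing step, since the same tag can appear in more than one crop.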

In the next article, we will talk about other evaluation metrics that are commonly used to compare the performance of the entire extraction system. Stay tuned!

Get in Touch

Interested in learning more? Schedule a call with us and we'll get back to you shortly.