Extract string from column SQL Server

6qqygrtg  于 2023-10-15  发布在  SQL Server
关注(0)|答案(3)|浏览(133)

I am trying to extract an OrderID from a text column in SQL Server. The order number does not always appear in the same place in the column.

Example data shown here:

Bill Jones ORD001234 code 0839 Swindon
ORD01235 - James Brown Code 3838 Norwich
ORD01236, - Jill Scott Code 9838 Glasgow
ORD0123 - Alan Ball - Bath - Code 0879

The ORD00000 value is what I need to extract from the column. However the Order value can be 7-9 digits in length

This is what I originally tried;

substring ([column], 1, CHARINDEX(' ', [column])) AS OrderNbr

Also gave this a go but the length of the order number did not help. I'm thinking some kind of case statement might work?

,substring([Notes], patindex('%ORD%', [Notes]), 8)
nhaq1z21

nhaq1z211#

One has to wonder why the order number is just dumped into a string with other information and not stored in its own column, or at least why the format isn't standardized.

Part of the problem is that looking for '%ORD%' must ignore a name like Frank B***ord***en. The solution must also be able to survive a value withoutORD at all or with leading garbage right beforeORD .

Here's one way to do that:

;WITH FirstPass AS 
(
  SELECT Notes, 
         pos = PATINDEX('%ORD[0-9][0-9]%', Notes),
         rt = SUBSTRING(Notes, COALESCE(NULLIF(
              PATINDEX('%ORD[0-9][0-9]%', Notes), 0),1), 4000)
  FROM dbo.TableName
)
SELECT Notes, OrderNumber = LEFT(rt, 
           2 + COALESCE(NULLIF(PATINDEX('%[^0-9]%', 
           SUBSTRING(rt, 4, 4000)),0),4000))
  FROM FirstPass
  WHERE pos > 0;

And now you know why we tend to follow patterns where each piece of data (like an order number) is stored independently, in its own column: no ugly parsing required. :-)

jdzmm42g

jdzmm42g2#

Try this: select substring(substring([Notes], patindex('%ORD%', [Notes]), 10),1,patindex('% %',substring([Notes], patindex('%ORD%', [Notes]), 10))-1) it first grabs the orderId with extra then clips everything after the space (including the space)

syqv5f0l

syqv5f0l3#

Please try the following solution that is using tokenization instead of parsing.

Notable points:

  • CROSS APPLY is tokenizing tokens column as XML.
  • XQuery .value() method is retrieving order token, aka order_number.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, tokens VARCHAR(1024));
INSERT @tbl (tokens) VALUES
('Bill Jones ORD001234 code 0839 Swindon'),
('ORD01235 - James Brown Code 3838 Norwich'),
('ORD01236, - Jill Scott Code 9838 Glasgow'),
('ORD0123 - Alan Ball - Bath - Code 0879');
-- DDL and sample data population, end

DECLARE @separator CHAR(1) = SPACE(1);

SELECT t.* 
    , c.value('(/root/r[contains(text()[1],"ORD")]/text())[1]', 'VARCHAR(20)') AS order_number 
FROM @tbl AS t
    CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' + 
        REPLACE(REPLACE(tokens,',',''), @separator, ']]></r><r><![CDATA[') + 
        ']]></r></root>' AS XML)) AS t1(c);

Output

IDtokensorder_number
1Bill Jones ORD001234 code 0839 SwindonORD001234
2ORD01235 - James Brown Code 3838 NorwichORD01235
3ORD01236, - Jill Scott Code 9838 GlasgowORD01236
4ORD0123 - Alan Ball - Bath - Code 0879ORD0123

相关问题