sql中替换逗号为换行符_使用T-SQL将逗号或其他定界符转换为SQL Server中的表或列表

sql中替换逗号为换行符

Database developers often need to convert a comma-separated value or other delimited items into a tabular format. Delimiters include pipe “|”, hash “#”, dollar “$” and other characters. SQL Server compatibility level 130, and subsequent versions, support a string_split() function to convert delimiter-separated values to rows (table format). For compatibility levels under 130, developers have previously done this with a user-defined function, which incorporates a While loop or Cursor to extract the data.

数据库开发人员通常需要将逗号分隔的值或其他带分隔符的项目转换为表格格式。 分隔符包括竖线“ |”,哈希“#”,美元“ $”和其他字符。 SQL Server兼容性级别130和后续版本支持string_split()函数,以将定界符分隔的值转换为行(表格式)。 对于低于130的兼容性级别,开发人员以前使用用户定义的函数来完成此操作,该函数合并了While循环或游标以提取数据。

为什么过程或T-SQL查询需要用逗号分隔的输入? (Why is comma-separated input required for a procedure or T-SQL query?)

Execute a program in an iteration using a loop is a common method in the back-end application as well as the database. When iteration count is smaller than expected, the execution time can be less however the number of iterations is bigger which leads to longer processing time. For example, if the application is fetching data from a database with executing a piece of program in a loop by parsing different input parameter application will need to wrap the database response of each iteration. To handle this type of situation, we can execute the same program with a single execution with parsing the comma or delimiter separated values as an input parameter in the database and make it to tabular format in the T-SQL program to proceed with further logic.

使用循环在迭代中执行程序是后端应用程序以及数据库中的常见方法。 当迭代次数小于预期时,执行时间可能会减少,但是迭代次数会增加,这会导致更长的处理时间。 例如,如果应用程序通过解析不同的输入参数通过循环执行一段程序从数据库中获取数据,则应用程序将需要包装每次迭代的数据库响应。 为了处理这种情况,我们可以一次执行同一程序,方法是将逗号或分隔符分隔的值解析为数据库中的输入参数,然后在T-SQL程序中将其转换为表格格式,以进行进一步的逻辑处理。

sql中替换逗号为换行符_使用T-SQL将逗号或其他定界符转换为SQL Server中的表或列表_第1张图片

Today, many platforms have been changed to micro-service architecture. Not only application but database design could also be structured in micro-service-based architecture and most of the internal communication between the databases is done with integer references only. That is the reason to get some tricky way to achieve better execution from the database side.

今天,许多平台已更改为微服务架构。 不仅应用程序而且数据库设计也可以在基于微服务的体系结构中构建,并且数据库之间的大多数内部通信仅使用整数引用来完成。 这就是为什么要采取一些棘手的方法从数据库端实现更好的执行的原因。

Multiple inputs to the same parameter can be accomplished with comma-separated values in the input parameter of the stored procedure or input to the tabular function, and used with the table in a T-SQL statement. There may be more situations for this strategy when using T-SQL. While programming with SQL Server, a developer must break a string into elements using a separator. Using the split function against the string defines the separator as an input parameter. The entire string will be split and returned as a table. In certain circumstances, the input parameter can be a combination of two or more character-separated values for each set of input. For instance:

可以使用存储过程的输入参数中的逗号分隔值或表格式函数的输入来完成对同一参数的多个输入,并与T-SQL语句中的表一起使用。 使用T-SQL时,此策略可能会有更多情况。 使用SQL Server进行编程时,开发人员必须使用分隔符将字符串分成多个元素。 对字符串使用split函数会将分隔符定义为输入参数。 整个字符串将被拆分并作为表返回。 在某些情况下,输入参数可以是每组输入的两个或多个字符分隔值的组合。 例如:

N’203616, 198667, 193718, 188769,…’ N’1021|203616$1021|198667$1022|193718$1022|188769$…’ N’1021|203616,198667$1022|193718,188769$…’

N'203616,198667,193718,188769,...'N'1021 | 203616 $ 1021 | 198667 $ 1022 | 193718 $ 1022 | 188769 $…'N'1021 | 203616,198667 $ 1022 | 193718,188769 $…”

sql中替换逗号为换行符_使用T-SQL将逗号或其他定界符转换为SQL Server中的表或列表_第2张图片

There are advantages to single-thread execution with comma-separated values over multi-thread execution in SQL Server:

与SQL Server中的多线程执行相比,带有逗号分隔值的单线程执行有很多优点:

  • Single execution for part of the program, resulting in the number of input

    对该程序的一部分执行一次,导致输入数量
  • Single read or write operation over the database tables instead of several iterations using a loop

    对数据库表进行一次读取或写入操作,而不是使用循环进行多次迭代
  • One-time server asset use

    一次性服务器资产使用
  • No more hurdles on the back-end application to set up a response from multiple result sets

    后端应用程序不再需要设置来自多个结果集的响应的障碍
  • One-shot commit means that the database/SQL Server won’t block the transaction and slow things down. Thus, no blocking possibilities

    一键提交意味着数据库/ SQL Server不会阻止事务并降低处理速度。 因此,没有阻塞的可能性
  • Easy to oversee transaction consistency

    易于监督交易的一致性

使用循环分割功能 (Split function using loop)

In most cases, a developer writes the table-valued function with a loop to extract the data from comma-separated values to a list or table format. A loop extracts the tuple one by one with the recursion. Performance issues can arise when working with the large input parameter since it will take more time to extract it. Here, we have a simple function with the name of split_string, which is using a loop inside the table-valued function. This function allows two arguments or input parameters (1 – input string and 2 – delimiter):

在大多数情况下,开发人员使用循环编写表值函数,以将数据从逗号分隔的值提取为列表或表格式。 循环与递归一起提取元组。 使用大输入参数时,可能会出现性能问题,因为提取它会花费更多时间。 在这里,我们有一个名为split_string的简单函数,该函数在表值函数内部使用循环。 此函数允许两个参数或输入参数(1 –输入字符串和2 –分隔符):

CREATE FUNCTION split_string
(
    @in_string VARCHAR(MAX),
    @delimeter VARCHAR(1)
)
RETURNS @list TABLE(tuple VARCHAR(100))
AS
BEGIN
        WHILE LEN(@in_string) > 0
        BEGIN
            INSERT INTO @list(tuple)
            SELECT left(@in_string, charindex(@delimiter, @in_string+',') -1) as tuple
    
            SET @in_string = stuff(@in_string, 1, charindex(@delimiter, @in_string + @delimiter), '')
        end
    RETURN 
END

A result-set table returned by this function contains every substring of the input parameter string which is separated by the delimiter. The substrings in the table are in the order in which they happen in the input parameter. If the separator or delimiter doesn’t coordinate with any piece of the input string, then the output table will have only one component. The tabular function will return the result set in a row-column format from the comma-separated string.

该函数返回的结果集表包含由定界符分隔的输入参数字符串的每个子字符串。 表中的子字符串按它们在输入参数中出现的顺序排列。 如果分隔符或分隔符不与输入字符串的任何部分协调,则输出表将仅具有一个组成部分。 表格函数将从逗号分隔的字符串中以行列格式返回结果集。

SELECT * FROM split_string('1001,1002,1003,1004', ',')

Recent versions of SQL Server provide a built-in function string_split() to do the same task with the input parameters of the input string and delimiter. But there is another efficient way to perform this task using XML.

SQL Server的最新版本提供了内置函数string_split(),以对输入字符串和定界符的输入参数执行相同的任务。 但是还有另一种有效的方法可以使用XML执行此任务。

使用XML的分割功能 (Split function using XML)

The Split function using XML logic is a useful method to separate values from a delimited string. Using XML logic with the T-SQL based XQUERY approach results in the T-SQL not being characterized as a recursion using loop or cursor. Essentially, the input parameter string is converted to XML by replacing the XML node in the T-SQL articulation. The resulting XML is used in XQUERY to extract the node value into the table format. It also results in better performance.

使用XML逻辑的Split函数是一种有用的方法,可以将值与定界字符串分开。 将XML逻辑与基于T-SQL的XQUERY方法结合使用会导致T-SQL不能被描述为使用循环或游标的递归。 本质上,通过替换T-SQL关节中的XML节点,将输入参数字符串转换为XML。 生成的XML在XQUERY中用于将节点值提取为表格式。 这也导致更好的性能。

For example, we have a comma-separated value with the variable @user_ids in the below T-SQL statement. In a variable, a comma is replaced by the XML tags to make it an XML data type document and then extracted using XQUERY with the XPATH as ‘/root/U’:

例如,在下面的T-SQL语句中,我们有一个用逗号分隔的值和变量@user_ids。 在变量中,逗号被XML标记代替,使其成为XML数据类型文档,然后使用XQUERY将XPATH提取为'/ root / U':

DECLARE @user_ids NVARCHAR(MAX) = N'203616, 198667, 193718, 188769, 183820, 178871, 173922, 168973, 164024, 159075, 154126, 149177, 144228, 139279, 134330, 129381, 124432, 119483, 114534, 109585, 104636, 99687, 94738, 89789, 84840, 79891, 74942, 69993, 65044, 60095, 55146'
 
DECLARE @sql_xml XML = Cast(''+ Replace(@user_ids, ',', '')+ '' AS XML)
    
SELECT f.x.value('.', 'BIGINT') AS user_id
INTO #users
FROM @sql_xml.nodes('/root/U') f(x)
    
SELECT * 
FROM #users

sql中替换逗号为换行符_使用T-SQL将逗号或其他定界符转换为SQL Server中的表或列表_第3张图片

In the above function, we have used the VARCHAR (100) data type for the extracted value rather than BIGINT. There can be changes in the input parameter string, too. For example, if the input string has one extra ‘,’ at the beginning or end of the string, it returns one extra row with 0 for the BIGINT data type and ” for VARCHAR.

在上面的函数中,我们将VARCHAR(100)数据类型用于提取的值,而不是BIGINT。 输入参数字符串也可以更改。 例如,如果输入字符串在字符串的开头或结尾有一个额外的“,”,则它将返回一个额外的行,其中BIGINT数据类型的返回值为0 ,而VARCHAR的返回值为 “”。

For example,

例如,

DECLARE @user_ids VARCHAR(MAX) = N'203616, 198667, 193718, 188769, 183820, 178871,'
DECLARE @sql_xml XML = Cast(''+ Replace(@user_ids, ',', '')+ '' AS XML)
    
SELECT f.x.value('.', 'BIGINT') AS user_id
FROM @sql_xml.nodes('/root/U') f(x)

sql中替换逗号为换行符_使用T-SQL将逗号或其他定界符转换为SQL Server中的表或列表_第4张图片

To handle this type of ” or 0 data in the result set, we can add the condition below in the T-SQL statement:

要在结果集中处理这种类型的“”或0数据,我们可以在T-SQL语句中添加以下条件:

DECLARE @user_ids VARCHAR(MAX) = N'203616, 198667, 193718, 188769, 183820, 178871,'
DECLARE @sql_xml XML = Cast(''+ Replace(@user_ids, ',', '')+ '' AS XML)
    
SELECT f.x.value('.', 'BIGINT') AS user_id
FROM @sql_xml.nodes('/root/U') f(x)
WHERE f.x.value('.', 'BIGINT') <> 0

Here, f.x.value(‘.’, ‘BIGINT’) <> 0 excludes the 0 and f.x.value(‘.’, ‘VARCHAR(100)’) <> ” excludes the ” in the query result set. This T-SQL condition can be added into the table-valued functions with the number of input parameter to handle this extra delimiter.

在这里,fxvalue('。','BIGINT')<> 0排除0,而fxvalue('。','VARCHAR(100)')<>″在查询结果集中排除。 可以使用输入参数数量将此T-SQL条件添加到表值函数中,以处理此额外的定界符。

Function:

功能:

CREATE FUNCTION split_string_XML
(
    @in_string VARCHAR(MAX),
    @delimeter VARCHAR(1)
)
RETURNS @list TABLE(tuple VARCHAR(100))
AS
BEGIN
DECLARE @sql_xml XML = Cast(''+ Replace(@in_string, @delimeter, '')+ '' AS XML)
    
    INSERT INTO @list(tuple)
    SELECT f.x.value('.', 'VARCHAR(100)') AS tuple
    FROM @sql_xml.nodes('/root/U') f(x)
    WHERE f.x.value('.', 'BIGINT') <> 0
    
    RETURN 
END

Execution:

执行:

SELECT * 
FROM split_string_XML('203616,198667,193718,188769,183820,178871,173922,', ',')

以分隔符分隔的字符串中的字符组合 (Combination of characters in delimiter-separated string)

But what about when the input parameter is a combination of two values with multiple separators? In this case, the input parameter must be extracted twice with the help of a subquery, as shown below:

但是,当输入参数是两个值与多个分隔符的组合时,怎么办? 在这种情况下,必须借助子查询将输入参数提取两次,如下所示:

DECLARE @in_string VARCHAR(MAX) = '1021|203616$1021|198667$1022|193718$1022|188769$'
DECLARE @sql_xml XML = Cast(''+ Replace(@in_string, '$', '')+ '' AS XML)
    
SELECT X.Y.value('(U)[1]', 'VARCHAR(20)') AS role_id,
        X.Y.value('(U)[2]', 'VARCHAR(20)') AS user_id
FROM
(
SELECT Cast(''+ Replace(f.x.value('.', 'VARCHAR(MAX)'), '|', '')+ '' AS XML) AS xml_
FROM @sql_xml.nodes('/root/U') f(x)
WHERE f.x.value('.', 'VARCHAR(MAX)') <> ''
)T
OUTER APPLY T.xml_.nodes('root') as X(Y)

This example is similar to the above example but we have utilized pipe “|” as a delimiter with a second delimiter, dollar “$”. Here, the input parameter is the combination of two-columns, role_id, and user_id. As we see below, user_id and role_id are in a separate column in the table result set:

这个例子与上面的例子相似,但是我们使用了管道“ |” 作为带有第二个定界符的定界符,美元“ $”。 在这里,输入参数是两个列的组合,即role_id和user_id。 如下所示,user_id和role_id位于表结果集中的单独列中:

sql中替换逗号为换行符_使用T-SQL将逗号或其他定界符转换为SQL Server中的表或列表_第5张图片

We can use any character in the above function, such as a single character-separated string or combination of character-separated strings. A developer must simply replace the character in the T-SQL statements above. Follow the steps described here to convert any delimited string to a list. Simply remember that the XML function of SQL Server accepts a string input and the input string will be delimited with the specified separator using the tabular function.

我们可以在上述函数中使用任何字符,例如单个字符分隔的字符串或字符分隔的字符串的组合。 开发人员必须简单地替换上述T-SQL语句中的字符。 请按照此处描述的步骤将任何定界字符串转换为列表。 只需记住,SQL Server的XML函数接受字符串输入,并且将使用表格函数使用指定的分隔符来分隔输入字符串。

翻译自: https://www.sqlshack.com/converting-commas-or-other-delimiters-to-a-table-or-list-in-sql-server-using-t-sql/

sql中替换逗号为换行符

你可能感兴趣的:(字符串,数据库,python,java,mysql)