A Beginner's Guide to LINQ, Part 1

In the tech world, acronyms are rife. There are hardware acronyms: SATA, IC, ACPI. There are software acronyms: SQL, J2EE, ASP. There are even acronyms for certifications of one's knowledge of a particular domain of acronyms: CISSP, MCPD, ISA. Any technology company that has had an impact in the field is sure to have introduced its own set of acronyms to the fray. One particular "new kid on the block" was introduced by Microsoft circa 2007: LINQ. In this article, I intend to provide an introduction to LINQ and its uses for everyone.

While I will attempt to explain the topic in a manner suitable for even a beginner, this article is intended for an audience with some level of programming experience. New programmers may want to hold off on reading the article until they have gained a basic understanding of programming fundamentals. The content of this article will revolve primarily around LINQ-to-Objects, though some of the concepts discussed will apply to LINQ-to-XML and LINQ-to-SQL.

What is LINQ?

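LINQ (Language Integrated Query) is a technology coined by Microsoft to "bridge the gap between the world of objects and the world of data" (1). That sounds like marketing hype to me. In some respects, though, that is exactly what LINQ is: a bridge between your code and some source of data. Do not take the term "data source" to strictly mean "database," however. For my purposes, "data" means some bit of information, and "source" means some origin of that data. In LINQ-land, a data source can be a text file, an XML file, objects in memory, and yes... a database.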

Aside from being some magical way of joining your code to your source of data, what else should you know about LINQ before diving in? First, it is a feature of the language you develop in. You can write LINQ queries (that's the "Q" in "LINQ" after all) right inside of your regular .NET code. The designers of each .NET language (e.g. C#, VB.NET, F#, etc.) have included specific language keywords which you can use to build your queries. Next, for your introduction to LINQ think of it as a supercharged foreach (For Each - VB.NET) loop. If you have experience in .NET, then you should be familiar with "for each" loops. Key to understanding how LINQ does what it does is understanding how a "for each" loop works. To understand how a "for each" loop works you need to understand the concept of an iterator.


Iterators

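The term "iterator" may sound intimidating to new programmers. It really isn't. An iterator is basically a way of looping over the elements of some collection. As the iterator passes over those elements, it keeps track of its position (2). That is how it knows which elements have already been visited and which have not. Think of an iterator as counting the people in a line. If you were counting the people in a line, you might point at each person as you counted. If someone interrupted you while you were counting, and nothing caused your pointing hand to move while you turned to talk to that person, then when you looked back at the line you would still be pointing at the last person you counted. The iterator is the equivalent of your pointing hand, aimed at the last person you counted. (Note: I purposely did not say "the next person to be counted." This is to be consistent with how iterators work.)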

So why on earth would one need to keep track of his position within an arbitrary collection of data? If he is using a "for each" loop to iterate over the whole collection, then he must want to interact with every piece of data in the collection, right? That is where the "each" part of the "for each" comes into play. "Each" in the English language corresponds to the quantification "one." When we use a "for each" loop, we are eventually going to examine every item in the collection (disregard side effects for now). We are going to do so one element at a time--even in the code that hosts the "for each." Having said that, recall that our iterator "remembers" where we are positionally within the collection. The compiler of our chosen language compiles our code in such a way that when we are in "for each" land, when our "for each" advances to the next element, we actually jump back into the code that created the iterator and we advance to the next item in the collection. Let us try another example.


Let us say that you are a factory worker. Your job is to take a line of buckets, each containing widgets, and one-by-one place the buckets on a conveyor belt to be used at various points along the assembly line. You are the iterator. The assembly line is the "for each" loop. When the conveyor belt starts, so does your work. You start with the first bucket, and you place it on the conveyor belt. The bucket proceeds through the assembly line. You have strict instructions not to proceed to the next bucket until the bucket you just sent comes back. You have no awareness as to how the bucket is being used on the assembly line; you only know that you cannot proceed to the next bucket until the bucket you just sent returns. As each bucket comes back to you, it arrives crushed, and there is nothing more you can do with a crushed bucket. You toss the unusable bucket aside and move on to the next bucket. This process continues until you exhaust the supply of buckets. This is equivalent to how the iterator works under the hood and in conjunction with the "for each" loop.

Even though, as the factory worker, you have no idea what the processes along the conveyor belt's path do with each bucket as it arrives, the work of supplying new buckets comes back to you. In this same way, the code which creates the iterator has no clue as to what the "for each" code does with the data it supplied; it only knows that once execution returns to it, it should supply the next piece of data. Furthermore, your duties do not include salvaging any unused widgets from the incoming bucket. They do not include trying to recycle any incoming buckets if they were not completely crushed. Your assignment is only to keep the conveyor belt running, and to do so one bucket at a time. So too does an iterator supply data, one element at a time. The iterator's only job is to keep supplying data to the caller as execution returns to it.
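To make the factory picture a little more concrete, here is a minimal sketch of roughly what a "for each" loop does on your behalf: it asks the collection for an enumerator (the pointing hand), then keeps calling MoveNext and reading Current until the supply runs out. The bucket names and the processed list are made up for illustration, and the code the compiler really generates has more detail than this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data: the "line of buckets" from the analogy.
var buckets = new List<string> { "bucket 1", "bucket 2", "bucket 3" };
var processed = new List<string>();

// Roughly what a foreach loop over buckets does behind the scenes.
using (IEnumerator<string> worker = buckets.GetEnumerator())
{
    // MoveNext() advances the "pointing hand"; it returns false once
    // there are no more buckets to hand over.
    while (worker.MoveNext())
    {
        // Current is the bucket the hand is pointing at right now.
        string bucket = worker.Current;
        processed.Add(bucket);
        Console.WriteLine($"Assembly line received {bucket}");
    }
}
```

Notice that the loop body never touches an index. All of the position tracking lives inside the enumerator, just as the worker alone knows which bucket is next.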

So then how does execution return to the iterator? We all know that when a function returns, that is it. There is no resuming where we left off (not without some dirty GOTO statement, but you would never do that, right?). Once a function returns, we do not jump back into it without calling it again. It is the same in mathematical functions. When we say y = x^2 (x-squared), once we get the value of y, is there any way for us to jump back into the function and change the way y is calculated? Of course not. But then how does the iterator circumvent this seemingly illogical roadblock? As previously mentioned, the compiler does a bit of magic itself.


IEnumerable, Meet yield

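Here is an example of a standard function definition that we can consider:

// The signature line did not survive in this copy; the name Add and the int parameter types are assumed.
int Add(int x, int y)
{
    int z = x + y;

    return z;
}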

That is, take in some parameters (or maybe no parameters), do some logic, and return some result. The key to the above is the return keyword. No matter where we place return in a function, if the logic within the function causes us to hit a return, then we exit the function, possibly returning a value along the way. The compiler structures the code in such a way to ensure this happens. In a function which creates an iterator, however, this is not quite the case. Take the following:


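// The signature line did not survive in this copy; the name GetValues is assumed.
// The IEnumerable<string> return type follows from the discussion below.
public IEnumerable<string> GetValues()
{
    for (int j = 0; j < this._values.Length; j++)
    {
        yield return this._values[j];
    }
}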

And I am sure you are saying, "Whoa! What the heck is yield?" Well, yield is a special keyword which lets the compiler know that we intend for this function to return things in an iterative way (3). In other words, this function will return things like a normal function would; however, it will return every single item in the associated collection (_values in this case), one at a time. So was I lying to you? I said earlier that functions return something and then there is no going back without calling the function again. That, my friend, is the magic of the yield keyword (and also the IEnumerable return type).

As I mentioned previously, the compiler will structure the compiled code in such a way that the runtime will pass whatever yield return returns back to the caller (e.g. a foreach loop), and when that caller is done with the current "iteration", execution will pick up at the next line of the code which creates the iterator (in the above, that would be the closing brace of the for loop). This is the same thing I explained in the conveyor belt example. The bucket going off on the conveyor belt, and the work eventually resuming with you placing the next bucket on the belt, exemplifies this behavior.
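If you would like to watch that back-and-forth happen, the following sketch records a trace from both sides of the conveyor belt. It is only an illustration: the values array stands in for the _values field above, GetValues is an assumed name, and the iterator is written as a local function so the whole thing fits in one small program.

```csharp
using System;
using System.Collections.Generic;

var trace = new List<string>();
string[] values = { "a", "b" };   // hypothetical stand-in for the _values field

// The iterator; GetValues is an assumed name.
IEnumerable<string> GetValues()
{
    for (int j = 0; j < values.Length; j++)
    {
        trace.Add($"iterator: yielding {values[j]}");
        yield return values[j];
        // Execution resumes here when the caller asks for the next item.
        trace.Add($"iterator: resumed after {values[j]}");
    }
}

// The caller; each pass through the loop body is one trip down the belt.
foreach (string value in GetValues())
{
    trace.Add($"caller: got {value}");
}

foreach (string entry in trace)
{
    Console.WriteLine(entry);
}
```

The trace alternates between the two: the iterator yields "a", the caller handles "a", the iterator resumes right after its yield return, and only then does it move on to "b".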

You may be wondering what would happen if you didn't use the yield keyword and just used return by itself. Well, two things would happen: 1) the code would not compile, because a plain return (in this example) returns a single string, but the function's definition expects an IEnumerable of strings; 2) assuming the code did compile, you would not get the results you expect. Remember that a return forces immediate exiting of the function--no going back. In this case, the yield return and the return type of IEnumerable are both required. It may seem strange that one string at a time is being "returned" by the iterator while we say that the method returns an IEnumerable, but this is a requirement of the iterator: the return type must be IEnumerable or IEnumerator (or one of their generic forms).

Now that you hopefully have some insight into the workings of iterators, let us examine how this fits together with LINQ.


Iterators and LINQ

Earlier I mentioned that LINQ is built into the language. The compiler still has to do some translation to turn your LINQ code into instructions the computer can understand. It converts your LINQ queries into a series of method calls (4). You would see these methods if you imported the System.Linq namespace into your project and brought up Intellisense for a particular collection. Some of these methods include: Where, Select, GroupBy, OrderBy, etc. Each of these methods is an extension method (5). These extension methods use iterators under the hood. Yes, if you were to decompile any of these methods you would see good ol' yield return within its code. When you chain together one or more of these methods, each item returned from the yield return actually passes from one method to the next before the next item is returned from the original collection. This is due to the behavior of yield return. This behavior is what gives LINQ so much power--like I said earlier: a supercharged foreach.
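Here is a small sketch of that item-by-item flow through a chain. The numbers array and the two logging helpers (IsOdd and TimesTen) are invented for illustration; Where and Select are the real System.Linq extension methods. The equivalent query syntax is shown in a comment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();
int[] numbers = { 1, 2, 3 };   // hypothetical sample data

// Helpers that record when each stage of the chain sees an item.
bool IsOdd(int n) { log.Add($"Where sees {n}"); return n % 2 == 1; }
int TimesTen(int n) { log.Add($"Select sees {n}"); return n * 10; }

// Same as: from n in numbers where IsOdd(n) select TimesTen(n)
var query = numbers.Where(n => IsOdd(n)).Select(n => TimesTen(n));
log.Add("query built, nothing has run yet");

foreach (int result in query)
{
    log.Add($"caller got {result}");
}

foreach (string entry in log)
{
    Console.WriteLine(entry);
}
```

The log shows that building the query runs neither helper, and that 1 travels through Where, then Select, then the loop body before Where is ever asked about 2.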

When you begin to think of your LINQ queries in this way, they become easier to understand--both in reading and writing such queries. Likewise, if you decide to use extension method syntax, you will understand why your method chains behave the way they do. Thinking of the query as an elaborate foreach loop helps you understand that something like this:


var query = from line in System.IO.File.ReadAllLines("someFile.txt")
            where line.StartsWith("some text")
            select line;
            
Dim query = From line In System.IO.File.ReadAllLines("someFile.txt") _
            Where line.StartsWith("some text") _
            Select line

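is really just a foreach loop over the lines of the file that passes along only the lines starting with "some text." (Strictly speaking, ReadAllLines reads the entire file into an array before the query sees a single line, so it does not use yield return, but that is a shortcoming of the ReadAllLines method, not the LINQ query.)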
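As a rough sketch of that equivalence, the hand-written iterator below does the same filtering as the query. MatchingLines is a made-up name, and the temporary file exists only so the example can run on its own. I have also swapped in File.ReadLines, which hands back lines lazily, in place of ReadAllLines; that is my substitution, not something the query above does.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Create a small throwaway file so the sketch is self-contained.
string path = Path.GetTempFileName();
File.WriteAllLines(path, new[] { "some text here", "other", "some text again" });

// Hand-written equivalent of the query: a filtering iterator (assumed name).
IEnumerable<string> MatchingLines(IEnumerable<string> lines)
{
    foreach (string line in lines)
    {
        if (line.StartsWith("some text"))
        {
            yield return line;
        }
    }
}

// File.ReadLines streams the file line by line, unlike File.ReadAllLines.
List<string> byHand = MatchingLines(File.ReadLines(path)).ToList();

List<string> byQuery = (from line in File.ReadLines(path)
                        where line.StartsWith("some text")
                        select line).ToList();

Console.WriteLine(string.Join(" | ", byHand));
Console.WriteLine(byHand.SequenceEqual(byQuery));

File.Delete(path);
```

Both lists end up holding the same two lines, which is the point: the query is the loop, just written more compactly.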

There is also a good bit of power in using the extension method syntax. A good portion of those methods have an overload which takes a predicate (6), which I will cover in a separate article. In short, a predicate is just a condition. Think of it like a "where" clause, but written in a slightly different way. With predicates, you can greatly affect the execution of your queries by letting the query run behavior you dictate, not just some default behavior coded into the extension method. The predicate is a slave to the yield return of the iterator, but the relationship hinders neither the execution of the extension method nor the evaluation of the predicate.
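Until that separate article arrives, here is a brief taste of what a predicate overload looks like. The words array is invented sample data; First, Count, and Any are real System.Linq methods that each accept a condition.

```csharp
using System;
using System.Linq;

string[] words = { "apple", "kiwi", "banana", "fig" };   // hypothetical sample data

// A predicate is just a function from an element to true or false.
Func<string, bool> isShort = w => w.Length <= 4;

string firstShort = words.First(isShort);       // first word passing the test
int shortCount = words.Count(isShort);          // how many words pass the test
bool anyLong = words.Any(w => w.Length > 5);    // does at least one word pass?

Console.WriteLine(firstShort);
Console.WriteLine(shortCount);
Console.WriteLine(anyLong);
```

Without the predicate, First would simply hand back "apple"; with it, you decide what "first" means.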


Summary

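While I always try to keep my explanations short and sweet, it never seems to work out that way. Congratulations on making it this far. By now you should have a general understanding of the underlying concept that makes LINQ so powerful and so useful. Although the descriptions above are better suited to LINQ-to-Objects, the concepts can be applied to LINQ-to-XML and LINQ-to-SQL as well. (There is, of course, a bit more going on with LINQ-to-SQL.)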

If you wish to dig deeper into the underlying logic, then read up on the yield keyword and its uses. I did not cover yield break anywhere above, but if you understand what the break keyword does in normal loop usage, then you already have a basic understanding of what it does in an iterator (and you should quickly understand why methods like Take and First work the way they do).
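For the curious, here is a minimal sketch of yield break, along with a hint at why Take and First can stop early. UntilNegative and Naturals are names I made up for illustration; Take, First, and ToList are the real System.Linq methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Yields items until it meets a negative number, then ends the sequence.
IEnumerable<int> UntilNegative(IEnumerable<int> source)
{
    foreach (int n in source)
    {
        if (n < 0)
        {
            yield break;   // like break, but it ends the whole iterator
        }
        yield return n;
    }
}

// An endless sequence; usable only because callers stop asking for items.
IEnumerable<int> Naturals()
{
    int i = 0;
    while (true)
    {
        yield return i++;
    }
}

List<int> beforeNegative = UntilNegative(new[] { 4, 8, -1, 15 }).ToList();
List<int> firstThree = Naturals().Take(3).ToList();
int first = Naturals().First();

Console.WriteLine(string.Join(", ", beforeNegative));
Console.WriteLine(string.Join(", ", firstThree));
Console.WriteLine(first);
```

Naturals never finishes on its own, yet Take(3) and First return promptly: they simply stop asking for more buckets once they have what they need.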

I did not show examples of the Yield keyword in VB. This keyword should be new in Visual Studio 11. Until then, as best I can tell, VB folks will have to implement IEnumerator themselves when they want to create their own iterators.

My articles are usually born out of some interesting or in-depth problem I have answered on the site. I will try to cover LINQ in more detail in future articles. Feel free to post a comment below to inquire about a particular LINQ topic for a future article. In the meantime, thanks for reading, and I hope you have a better understanding of LINQ and iterators and the "magic" you can achieve by using them.


Resources

dotPeek - A .NET decompiler. This can be useful to see how the existing extension methods work.

References

1. Introduction to LINQ

2. Iterators

3. yield

4. LINQ Query Syntax versus Method Syntax

5. Extension Methods

6. Predicate

Translated from: https://www.experts-exchange.com/articles/10170/A-Beginner's-Guide-to-LINQ-Part-1.html
