Amazon Neptune architectures for scale, availability, and insight

Ok, we can get started. Hello everyone. Thank you very much for coming along this morning, at the very start of the week. My name is Ian Robinson and I'm a Principal Graph Architect at AWS. In this session, I'll be talking about how you can take your Amazon Neptune applications to the next level in terms of scale, availability, and the insights you offer graph practitioners.

So I'll start by talking about scaling, or what I sometimes call scaling for success. If your application is successful, if it does the job that it was intended to do very well, then you're invariably going to attract more users and more queries, you may add features that require more complex queries, and you may even introduce new workloads running against the same underlying data. By that, I mean you may have started with some online queries supporting a web application, but you then decide that you want to add some reporting or some lightweight analytics, or you may want to use the very same data to train a machine learning model. So: different workloads running against the same underlying data.

All of these things are drivers for scaling. So I'm going to talk about a couple of the things that you should be looking out for that will help you identify your scaling needs, and then about the decisions you can make with regards to the resources that you assign to meet those needs. In particular, I'll be talking about how you can determine whether Neptune Serverless is a good fit for your workload. And then finally, having assigned the right resources to meet those workload needs, how do you ensure that they're utilized effectively? That's the last part of this scaling section.

So a couple of things to be looking out for with an existing application. The first thing is the situation where all of the worker threads on your database instances are busy most of the time, or all of the time, servicing queries. When that happens, new queries coming in will end up being queued on the server. The server has a server-side queue, and those queries coming in will sit on that queue for a while. That can introduce some additional latency, or some unpredictable latencies, into your application: a query may take just a few milliseconds to execute one time, and the next time round it may take several seconds because it's been sitting behind other queries in that queue. You can monitor the depth of that server-side queue on each instance using the MainRequestQueuePendingRequests CloudWatch metric, and if that is frequently or always above zero, then it's likely you're sending more work to the cluster than it can deal with immediately.

If that's the case, then have a look at the CPU utilization metrics on those instances. If those metrics are really high, 80-90% or more, then it's likely that all of those worker threads are busy doing useful work, so this is probably a good indicator that you need to scale for more worker threads, for more CPU. If that CPU utilization is pretty low, 50% or lower, those threads are perhaps busy, but they're waiting for data to come back from storage. It may be that they don't have the data available to them that they need to process and compute a result, so they're waiting for data to come back from storage. So there's a kind of network I/O wait.

So that brings me to the second thing that you should be looking out for, and this is buffer cache churn. This is the situation where your working set can't necessarily fit into main memory, into the buffer cache on those database instances. Queries are always going to run fastest if the data that they need is already in main memory. But if it's not in memory, then Neptune is going to have to go down to that separate managed storage layer to retrieve one or more data pages, and it will bring them back into the buffer cache. But if that buffer cache is already full when those pages come back, then it's going to have to evict some of the least recently used pages in the cache. And at that point, you're going to be experiencing a lot of buffer cache churn.

And you can see that happening: you'll see the buffer cache hit ratio going down, and if it's frequently going below 99.9%, then your buffer cache is probably smaller than your working set. As that BufferCacheHitRatio goes down, you'll often see VolumeReadIOPs going up, because we're constantly going down to storage to get all of those additional data pages. So the impact here is very similar: additional latencies from the application's point of view, because it's waiting longer for a particular piece of work to be done. But there's also a cost implication, because one of Neptune's charging dimensions is the number of I/O operations. So if you're experiencing a lot of buffer cache churn, you're potentially racking up costs along that dimension, because you're constantly going down to the storage layer.
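As a rough sketch, these are all standard CloudWatch metrics in the AWS/Neptune namespace, so you can keep an eye on them with a few lines of Boto3 (the instance identifier here is a placeholder):

```python
import boto3
from datetime import datetime, timedelta, timezone

INSTANCE_ID = "my-neptune-instance-1"  # placeholder instance identifier
cloudwatch = boto3.client("cloudwatch")

def instance_metric(metric_name, stat="Average", hours=24):
    """Fetch a per-instance Neptune metric at 5-minute resolution."""
    end = datetime.now(timezone.utc)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/Neptune",
        MetricName=metric_name,
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": INSTANCE_ID}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=300,
        Statistics=[stat],
    )
    return sorted(resp["Datapoints"], key=lambda d: d["Timestamp"])

queue_depth = instance_metric("MainRequestQueuePendingRequests", stat="Maximum")
cache_hits = instance_metric("BufferCacheHitRatio")

if any(d["Maximum"] > 0 for d in queue_depth):
    print("Requests are queueing; check CPUUtilization to decide whether to scale.")
if any(d["Average"] < 99.9 for d in cache_hits):
    print("BufferCacheHitRatio dips below 99.9%; the working set may not fit in memory.")
```

VolumeReadIOPs, which is reported at the cluster level, is worth plotting alongside these to see the cost impact of all those reads against storage.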

So this is an indication that you may want to scale for a larger buffer cache, for more memory. Having identified your scaling needs, what can you do about it? Well, you can scale up with larger instances. Bigger instances give you more worker threads (the number of worker threads is two times the number of virtual CPUs on each instance), and bigger instances will also give you a bigger buffer cache (the size of the buffer cache is approximately two thirds of the main memory available on each instance). But besides scaling up, you can also scale out for high read workloads with additional read replicas; you can add up to 15 read replicas to a cluster.

Now you can do that manually via the console or the CLI or the SDK or a CloudFormation template. But Neptune also has an auto scaling feature that will add and remove read replicas based upon CPU thresholds that you specify or based on a schedule that you supply. I actually think the best use of this feature is scheduled auto scaling. If you have a variable workload but you can predict when it's going to peak (say a workload that peaks every day around about the same time and then tails off around about the same time every day), then you can use scheduled auto scaling to add read replicas to address the peak and then schedule their removal once the traffic has died off.
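Scheduled auto scaling for Neptune read replicas is configured through Application Auto Scaling. A minimal sketch, with placeholder cluster name, capacities and cron schedules:

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

RESOURCE_ID = "cluster:my-neptune-cluster"       # placeholder cluster name
DIMENSION = "neptune:cluster:ReadReplicaCount"

# Register the cluster's read-replica count as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="neptune",
    ResourceId=RESOURCE_ID,
    ScalableDimension=DIMENSION,
    MinCapacity=1,
    MaxCapacity=8,
)

# Add replicas ahead of the daily peak...
autoscaling.put_scheduled_action(
    ServiceNamespace="neptune",
    ScheduledActionName="scale-out-for-peak",
    ResourceId=RESOURCE_ID,
    ScalableDimension=DIMENSION,
    Schedule="cron(0 8 ? * MON-FRI *)",   # 08:00 UTC, weekdays
    ScalableTargetAction={"MinCapacity": 4, "MaxCapacity": 8},
)

# ...and remove them once the traffic has died off.
autoscaling.put_scheduled_action(
    ServiceNamespace="neptune",
    ScheduledActionName="scale-in-after-peak",
    ResourceId=RESOURCE_ID,
    ScalableDimension=DIMENSION,
    Schedule="cron(0 18 ? * MON-FRI *)",  # 18:00 UTC, weekdays
    ScalableTargetAction={"MinCapacity": 1, "MaxCapacity": 2},
)
```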

And then once you've decided whether you're going to scale up or out, or do both, have a look at the particular instance types or instance families available to you. Different instance types have specific features that help address particular workload needs. So for example, the r5d instance types, in fact any of the instance types that Neptune supports with a 'd' in the name, all support an NVMe-based property lookup cache. This is in addition to the buffer cache, and it's useful for workloads where you have queries that are frequently accessing lots of node and edge properties or RDF literals. If you need a bigger buffer cache but you don't necessarily need more worker threads (you're not seeing a lot of queuing, but you are seeing a lot of buffer cache churn), then look at the x2 instances.

These have a larger memory-to-virtual-CPU ratio than their peers in the other instance families, typically four times the amount of memory per virtual CPU versus the other instance types. And then finally, if you have a variable workload, a workload where you experience peaks and then troughs, very high throughput and then periods where you've got relatively low throughput or the cluster is idle for long periods of time, then you should consider using serverless instances. Serverless instances can scale dynamically in order to address those workload needs, and that's what I'm going to talk about next.

So as you may know, Neptune Serverless instances scale using what are called Neptune Capacity Units. An NCU is equivalent to two gigabytes of RAM and the associated CPU and network, and an instance can scale as low as one NCU and as high as 128 NCUs. And then you can further control that at the cluster level by specifying minimum and maximum values that will apply to all of the serverless instances in that cluster.

The minimum value determines how quickly a new serverless instance or an idle serverless instance is going to scale up. And the maximum value allows you to control or cap the amount of capacity that will be assigned to each serverless instance. These values apply on a per-instance basis, and using that maximum value you can effectively control costs: you know what the maximum spend is going to be, given that maximum capacity.
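A minimal sketch of setting that cluster-level capacity range with Boto3 (the cluster identifier and the capacity values are placeholders to adjust for your workload):

```python
import boto3

neptune = boto3.client("neptune")

neptune.modify_db_cluster(
    DBClusterIdentifier="my-neptune-cluster",   # placeholder
    ServerlessV2ScalingConfiguration={
        "MinCapacity": 2.0,    # NCUs an idle or newly started serverless instance scales from
        "MaxCapacity": 64.0,   # per-instance cap, which also caps the spend
    },
    ApplyImmediately=True,
)
```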

So serverless scales along several different dimensions: CPU utilization, memory utilization and network throughput. Effectively, what it's doing on a second-by-second basis is some of the stuff I was talking about a moment ago: it's looking at the memory demands and CPU demands on that instance.

What I really want to talk about with regards to this slide are some of the situations in which you would choose not to use serverless. So if you have a large, frequent write job, perhaps once a week you bulk load many millions or billions of records into Neptune, these very large write jobs can sometimes cause Neptune to refresh or recalculate the DFE statistics. These are the statistics that are used by the query engine when it's planning a query, and the same statistics that are used by the summary API to give you a summary of the graph.

Now, that recalculation for a very, very large data set can sometimes take quite a while. And if you've got a serverless primary instance, which is where this recalculation takes place, that instance will be at its maximum capacity for many hours or even days, and that's not a good use of serverless. So for these very large, repeated write jobs, I would recommend using a large provisioned instance, even if you choose to use serverless with your read replicas.

Another thing: don't use serverless if you've got very latency-sensitive query requirements. Serverless instances will start scaling up in a matter of seconds as the traffic increases. But until they can acquire all of the capacity that they need in order to service that higher traffic, some of those requests coming in may end up on that server-side queue. And if a worker thread can't acquire enough memory whilst it's generating intermediate results, it will spill to disk. Both of these things can cause some additional latency for some period of time, until that instance has scaled up fully.

And then finally, don't use serverless if you've got a very memory-intensive workload, if the buffer cache requirement is greater than what can be provided by an 8xlarge instance. 128 NCUs is equivalent to an 8xlarge instance, and if you need a buffer cache larger than that, then serverless is probably not for you.

But assuming that your workload isn't constrained in any of those ways, how can you tell whether Neptune Serverless is genuinely a good fit for your needs? Well, it's partly trial and error, but there is a bit of method to it as well. What we recommend is take an existing workload and look at the CPU utilization metrics for a period of time, for a duration that represents all of the variability in that workload, all of the cyclical behaviors, things like that. If the area under that CPU utilization curve is less than, or around about, 50% of the total potential CPU utilization (were it running at its peak for the entire duration), then this is potentially a good candidate for serverless.
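As a rough sketch of that back-of-the-envelope check, assuming hourly CPUUtilization averages over a window that covers the workload's full cycle (the instance identifier is a placeholder):

```python
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/Neptune",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "my-neptune-instance-1"}],
    StartTime=datetime.now(timezone.utc) - timedelta(days=7),
    EndTime=datetime.now(timezone.utc),
    Period=3600,
    Statistics=["Average"],
)

values = [d["Average"] for d in resp["Datapoints"]]
peak = max(values)
# Area under the utilization curve, relative to running at peak for the whole window.
ratio = sum(values) / (peak * len(values))

print(f"Peak CPU: {peak:.1f}%, area ratio: {ratio:.2f}")
if ratio <= 0.5:
    print("Under ~50% of peak-for-the-whole-duration: worth trialling serverless.")
else:
    print("Mostly busy: provisioned (plus scheduled auto scaling) may be cheaper.")
```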

And at that point, you can run an experiment: you can flip to running serverless, run your workload for a few days, drive out all of that variability, all of those cyclical behaviors, and then review the costs. Compare the cost of running serverless with the cost of running the provisioned instances.

So how can you estimate the costs of running serverless, or calculate the cost of having run serverless for just a few days? Well, we've produced a very simple little tool, which is in our AWS Labs GitHub repository, called the serverless cost evaluator. It's a command-line tool, and you run it against a serverless instance that's been running for a few days, and it will estimate how much it cost you to run that instance for that period of time. The other nice thing that it does is that it will try to recommend a provisioned instance that would have addressed those workload needs were it running at its peak. So it will recommend a provisioned instance for the peak part of that workload, and it will give you the cost of having run that provisioned instance as well. So now you can start to compare costs and decide whether serverless is appropriate for you.

So I actually ran some experiments in preparation for today. I've got three different variable workloads, and they all peak at around 1,500 requests a second and they're a mixture of reads and writes. The first one I'm going to characterize as an office hours workload. It peaks and runs at its peak for 6, 7, 8 hours a day and then it drops off, and it's a relatively low throughput workload for the remainder of the day, perhaps for another 16 or 18 hours, something like that. And it's idle at the weekends.

The second workload is what I call an international hours workload. So it's very similar, but now it's running at its peak for about 18 hours a day and it's idle only for around about six hours a day. And again, it's idle at the weekends.

And then the third one is an on-off workload. It's constantly peaking: peaks and troughs, high throughput, low throughput, and that's running constantly, seven days a week.

So if you look at that first one, the office hours workload, and we look at that CPU utilization curve, it definitely looks as though the area under that curve is way less than 50% of the total potential CPU utilization. So this seems a good candidate for serverless.

So what I did then was flip to using serverless: I simply provisioned the serverless instance, failed over to that instance and deleted my old provisioned instance without interrupting the workload, and then ran this for a few days and ran the serverless cost evaluator. And it tells me that for that five-day period, serverless cost me approximately $180.

And then it's saying that an equivalent provisioned instance, a good provisioned instance, would be a 2xlarge. This would address the peak needs of that workload, and if you were running that for five days, it would cost just over $140.

And then if I adjust those figures for the seven-day week, the two days at the weekend where it's idle, we can see that serverless is definitely a lot cheaper than provisioned for this particular workload. So this seems to be a very, very good workload for serverless.

The second one is that international hours workload. And as soon as I look at that utilization curve, I can see that it's way more than 50% of the total potential consumption. So right from the outset, I don't think this is a good candidate for serverless. I do think this is a good candidate for that scheduled auto scaling. This is the kind of workload that at 5 or 6 in the morning begins to peak and then drops off after 18 hours or so, and it does that repeatedly, every day or five days a week. It's very easy to schedule the addition of read replicas and then to have those read replicas removed once that workload tails off. And then finally, that on-off workload: I did exactly the same.

It looked to me as though the area under the curve was around about 50%, so it was worthwhile running the experiment. I ran it for a few days; I think I ran the cost evaluator for just three days' worth of metrics (the cost evaluator uses CloudWatch metrics to generate those estimates). It says that serverless cost me just over $30; the equivalent 2xlarge instance would have cost just under $30, $28 or $29. So it's very, very close, and just a small change in that workload, just a couple of hours more every day running at its peak, could potentially make serverless a lot more expensive versus provisioned.

So in this case, I'm slightly on the fence, but I'd probably stick with running provisioned for this particular workload. Ok. So that's identifying some of your scaling needs and looking at the kinds of database resources that you can provision to address those needs.

The next thing I want to talk about is how you can utilize those resources efficiently. Ideally, you want each of those instances that are dedicated to a workload to be doing roughly the same amount of work. If one of them is underutilized, it's going to cost you for no appreciable benefit. So ideally, you want to be able to distribute the work across all the instances that are servicing a particular workload.

And what I want to talk about here is how you can scale graphs for multiple workloads. And I mean two slightly different things here when I talk about multiple workloads. The first is the situation where you've got different query and access patterns running against the same underlying data. So this is the situation in which you may have some online queries backing that web app, and then you introduce some reporting or some analytics, or you decide to train a machine learning model. So very different query access patterns, but all touching the same underlying data set.

The second meaning of multiple workloads in this context is multiple tenants. So you may have multiple clients, multiple customers, each with their own discrete data set, their own discrete component or subgraph within that larger data set. All of those clients or customers are probably running similar query access patterns, but against very discrete data sets.

Now, irrespective of whether it's one or the other, the problems are pretty much the same: you've got multiple workloads that are all competing for resources within the cluster, and in many cases the aggregate working set of all of those workloads is greater than you can support with a single instance. So not all of that data is going to fit into memory; you've got all of these different workloads competing for resources, and they're all competing for the buffer cache.

And what you may see there is a lot of that buffer cache churn, and that brings all of those problems around additional latencies and additional costs, all of those read operations against the underlying storage. And then on top of that, you get no query prioritization. If they're all competing for resources, you may end up in situations where a really important but short-running query is sitting in the queue behind a less important, less critical but long-running query, or is waiting on a long-running query to complete.

So how can you ensure all of these database resources are employed efficiently when you have these multiple workloads? Well, there's a well-established technique called cache sharding, which is effectively routing specific workload traffic to an individual replica or a set of replicas that are dedicated to servicing that workload. That way, hopefully, you end up with a smaller working set. You don't have a very large aggregate working set, you have a smaller working set, so it's more likely you can size instances and the buffer cache to accommodate that working set. You get fewer cache evictions, and then you can tune individual replicas or sets of instances according to the workload needs.

So you've got all of those choices around scaling up, around choosing particular instance types, perhaps adopting the lookup cache or using serverless. And you can do that on a workload-by-workload basis, as long as you consistently route traffic belonging to a specific workload to a particular replica or set of replicas.

So how can you do that? I mean, you can imagine developing some reasonably complex application logic to do that routing, or using load balancers to route traffic appropriately. But the things that you need to think about are how you are going to accommodate changes in the cluster topology. And your cluster is going to change: you're going to add read replicas, you may remove replicas, you may experience a situation where a replica is promoted to primary. That cluster topology is going to change. How do you ensure that you can accommodate those changes in all of your routing logic? We don't really want you to spend time worrying about that.

Fortunately, there is something in Neptune, an inbuilt feature today, that can help you with that, and that's custom endpoints. Custom endpoints are like the reader endpoint, but you get to decide the membership set, and then the custom endpoint is going to balance connections across all of the instances that belong to that particular endpoint's membership set. You get a couple of very simple controls in terms of defining how you create these custom endpoints: you can use include lists or exclude lists. So I could create a custom endpoint that only includes those two readers, or I could create a custom endpoint that includes the primary and a reader, so that I can utilize the primary: if it's underutilized for writing, I might want to include it for servicing some read requests.
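As a minimal sketch, custom endpoints are created through the management API with static include or exclude lists; the cluster and instance identifiers below are placeholders:

```python
import boto3

neptune = boto3.client("neptune")

# A reader-only endpoint for two specific replicas.
neptune.create_db_cluster_endpoint(
    DBClusterIdentifier="my-neptune-cluster",
    DBClusterEndpointIdentifier="analytics",
    EndpointType="READER",
    StaticMembers=["my-neptune-replica-2", "my-neptune-replica-3"],
)

# An endpoint that also includes the primary, for read traffic when the writer is underutilized.
neptune.create_db_cluster_endpoint(
    DBClusterIdentifier="my-neptune-cluster",
    DBClusterEndpointIdentifier="reads-plus-writer",
    EndpointType="ANY",
    StaticMembers=["my-neptune-primary", "my-neptune-replica-1"],
)
```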

Custom endpoints have the advantage that it's really simple to configure your application. You simply need to configure the application with the specific endpoint addresses for each of those endpoints, and it works across all of the different query languages and data models. There are no special client drivers or anything like that that you need to configure. It's just going to work, irrespective of whether you're using property graph or RDF, openCypher, Gremlin or SPARQL.

There are some downsides as well. Custom endpoints suffer from the same problem that the reader endpoint suffers from, which is connection swarms. The way these things are implemented, the IP address to which that endpoint resolves changes every five seconds. If you open a large connection pool and open all those connections around about the same time, they're all going to get locked on, typically to a single instance. And then, because they're long-lived connections, you stay locked on to that instance for a very long period of time, tens of minutes, hours even. So you can often end up in this situation where, despite having an endpoint that's pointing at many instances, all of your traffic is just going to one instance.

So the thing to do here is recycle connections, recycle connections every 10 seconds or so, and if possible turn off or reduce the time-to-live in any DNS caches. That way you'll get a more even distribution of traffic across the instances, whether you're using the reader endpoint or a custom endpoint.
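A minimal sketch of that recycling pattern using the open source Gremlin Python driver; the endpoint address and the 10-second interval are placeholders:

```python
import time
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

ENDPOINT = "wss://my-custom-endpoint.cluster-custom-xxxx.us-east-1.neptune.amazonaws.com:8182/gremlin"
RECYCLE_AFTER_SECONDS = 10

class RecyclingConnection:
    """Reopen the connection periodically so that traffic spreads across
    whichever instance the endpoint's DNS name currently resolves to."""

    def __init__(self):
        self._conn = None
        self._opened_at = 0.0

    def g(self):
        now = time.monotonic()
        if self._conn is None or now - self._opened_at > RECYCLE_AFTER_SECONDS:
            if self._conn is not None:
                self._conn.close()
            self._conn = DriverRemoteConnection(ENDPOINT, "g")  # resolves DNS afresh
            self._opened_at = now
        return traversal().withRemote(self._conn)

conn = RecyclingConnection()
print(conn.g().V().limit(1).count().next())
```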

Another downside of custom endpoints is that, while they're very simple to configure, they're not very flexible. They're not very powerful for creating application-meaningful semantics around the different endpoints that you want to employ; all you've got are these include lists and exclude lists. So you actually have to explicitly add specific instance IDs to one or other of those lists, and it's still a little bit difficult to ensure that these endpoints properly adapt to changes in the cluster topology. If an instance's role changes, if a reader is promoted to primary, it will still remain a member of any of those endpoints that it previously belonged to, even if that wasn't your intention.

So to address some of those issues, we've created a very simple little package called dynamic custom endpoints. It's still using custom endpoints, but it's a CloudFormation template that will install a Lambda function and a scheduler in your account. And then it gives you the ability to create these very rich declarative specifications that describe your membership set, and every minute or so the Lambda function will update the custom endpoints in your cluster.

So here's an example specification. This is saying that any instance that is in the role of reader and that is tagged either BI or reporting is a member of this particular custom endpoint. Now, there are lots of different ways in which you can specify the membership: you can use things such as availability zones, instance types, instance sizes, the names of particular instances. But I think the most powerful way of using it is with tags, because you can now control which instances are added and removed based on adding and removing tags.

So it may be that you hide an instance from a custom endpoint for a while and then tag it, and the next minute it will be picked up and included in that custom endpoint. Tagging here gives you a very nice, application-meaningful way of defining the members, or the membership set, for a custom endpoint.

So given that cluster topology there, those three instances, two of which are tagged BI and one of which is tagged reporting, will all be considered members of that group. And if I were to remove one of those tags, remove one of those BI tags, within a minute the membership set for that custom endpoint will have adjusted automatically so that it only includes two readers.

So this is your normal setup: you've got a client or an application sitting in the Neptune VPC. When you run the CloudFormation template, you just get a very simple Lambda function and a scheduler. You supply a JSON document as the configuration for that Lambda function; that's your specification of all the different custom endpoints. The scheduler triggers the Lambda function every minute, and the Lambda function goes to the management API, gets the cluster topology, evaluates it against those specifications and then decides whether it needs to create or update existing custom endpoints.

So that's scaling. What I want to talk about next is availability, and in particular, I want to talk about how you can reduce downtime and increase the availability of your application and your database during Neptune engine updates.

Neptune frequently releases engine updates in the form of major version changes, minor version changes and patch releases. The major and the minor versions are optional, which means that you can decide to stay on the version of the engine that you're currently on for as long as you want, up until that version is deprecated.

So earlier this year we deprecated the last of the 1.0 engines, and at the end of January next year, January 2024, we'll be deprecating the 1.1.0 engine. So major and minor versions are optional; patch releases are mandatory.

So after a patch is made available, there's a 14-day grace period in which you can choose to apply that patch yourself at any time that suits you. But if you don't apply it within that 14-day grace period, then the patch will be automatically applied during the next maintenance window.
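As a rough sketch, you can list any outstanding patches and opt in to applying them yourself during that grace period via the management API (the cluster ARN below is a placeholder):

```python
import boto3

neptune = boto3.client("neptune")

# List outstanding maintenance actions across your Neptune resources.
pending = neptune.describe_pending_maintenance_actions()
for resource in pending["PendingMaintenanceActions"]:
    print(resource["ResourceIdentifier"])
    for action in resource["PendingMaintenanceActionDetails"]:
        print("  ", action["Action"], action.get("Description", ""))

# Apply a pending action now, rather than waiting for the maintenance window.
neptune.apply_pending_maintenance_action(
    ResourceIdentifier="arn:aws:rds:us-east-1:123456789012:cluster:my-neptune-cluster",
    ApplyAction="system-update",
    OptInType="immediate",
)
```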

Now, up until version 1.3.0, which we released just a couple of weeks ago, the difference between minor versions and patches was a little bit vague, and often new features that ideally should have gone into a minor version release were actually introduced by way of a patch. But from 1.3.0 onwards, we're being a lot stricter about this separation, and only critical bug fixes, security fixes and operating system patches are going to go into those patch releases.

Now, irrespective of the type of update, whether it's major, minor or a patch, there is going to be some downtime, because they all require restarting instances. If it's a patch or a minor version update, the downtime is probably going to be in the order of several tens of seconds, or perhaps a couple of minutes. But if it's a major version change, the downtime could be many tens of minutes or even more.

So some of those major version upgrades have involved changes to the underlying storage format. If you've got a very large storage volume, those changes can take many tens of minutes, perhaps an hour or more. And during that time, your cluster isn't available.

So this is obviously a bit of an issue, and we've spoken to many customers who've asked whether there are ways to mitigate or minimize this downtime, particularly for those major version updates.

That's why, earlier this year, we released Neptune Blue Green Deployment, which allows you to clone a blue production cluster to a green cluster, where it's upgraded in place, and then migrate your application to that green cluster once you're happy that it's all working well, minimizing the downtime. So even for major version updates, Blue Green Deployment results in a downtime of probably just a couple of minutes, several minutes or so.

So it's built on a couple of native Neptune features including fast database cloning and Neptune Streams and it's installed via a CloudFormation template. And once that template has been installed, it runs automatically and it will begin to clone the blue production cluster to the green. It will upgrade the green cluster and then it will replicate any changes that have taken place in the interim from the blue to the green. And all this time, your application is continuing to interact with that blue cluster. So it's only at the very end where there's a small amount of downtime when you flip from blue to green.

So there are a few things that you need to do, and a few things that you need to think about, before you can use Blue Green Deployment. First of all, you need to ensure that Neptune Streams is enabled on your blue production cluster. If it isn't, that's a cluster parameter change and you will need to restart the instances, but that's just a matter of a couple of seconds. You also need to ensure that you have a DynamoDB VPC endpoint in that Neptune VPC, and the reason for that is that the replication part of the migration uses DynamoDB to checkpoint its progress.
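A minimal sketch of those two prerequisites with Boto3, assuming placeholder parameter group, VPC and route table identifiers:

```python
import boto3

neptune = boto3.client("neptune")
ec2 = boto3.client("ec2")

# 1. Enable Neptune Streams on the blue cluster's parameter group
#    (the instances pick this up after a restart).
neptune.modify_db_cluster_parameter_group(
    DBClusterParameterGroupName="my-blue-cluster-params",
    Parameters=[{
        "ParameterName": "neptune_streams",
        "ParameterValue": "1",
        "ApplyMethod": "pending-reboot",
    }],
)

# 2. Add a DynamoDB gateway endpoint to the Neptune VPC so the migration
#    can checkpoint its replication progress.
ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.dynamodb",
    VpcEndpointType="Gateway",
    RouteTableIds=["rtb-0123456789abcdef0"],
)
```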

But once the migration is complete, if you uninstall that CloudFormation stack, then a lot of the ancillary assets, such as that DynamoDB table and a little EC2 instance that's been managing the process, are all deleted on your behalf. And there are a few things that you need to think about. You need to choose a period of relatively low write traffic. You can keep your application working against the database whilst all this is taking place, but if you've got really, really high write throughput during that period, there's a chance that the green cluster will never catch up with the blue when we're trying to replicate those changes. So try to identify a period of relatively low traffic. And then you're going to need to think about how you're going to manage the changeover at the very end: towards the end of the process, you're going to have to pause the writes on the blue cluster just to allow the last few transactions to replicate over, and then you're going to have to configure your application to switch from using the blue to the green cluster, because this is a new cluster with new endpoints, new DNS addresses. So somehow you will need to ensure that you can configure your application to direct the traffic from one to the other at the end of the process.

So what does this look like in practice? I'll just quickly run through it. Here we've got a blue cluster. It's on 1.1.0, that version that will be deprecated at the end of January next year, and it's a primary instance with three read replicas. So the first things I do: I ensure that Neptune Streams has been enabled on that cluster, and that I've got a DynamoDB VPC endpoint. Then I can install the CloudFormation template. I get to specify the target cluster ID, the name of the target cluster, and this name will be incorporated into all of those endpoint addresses later on. So you get to specify the name of that green cluster, as well as the source cluster, and we're also specifying the target engine version.

And then once that's up and running, the migration begins automatically. The first thing it does is clone the blue cluster: it uses Neptune's fast database cloning feature to create a copy of the database, and that green cluster will contain a copy of the data in the blue cluster at the point in time that the cloning was initiated. So we've got a green cluster with most of the data already in it at this point in time. The process also makes a note of the very last transaction ID that had been applied to that green cluster, and it's going to use that a little later on in the process: when we begin the replication to catch up, it will be able to resume applying transactions from the transaction immediately after this ID.

At this point, you can start monitoring the migration via CloudWatch Logs. Here there's lots of detail about the cloning that's taking place. Other things are being copied over too, things like security groups, IAM roles, tags and all of the configuration, so it's making a complete copy of the source. Once that cloning is complete, the process then begins to upgrade the green cluster in place, and it may go through several intermediate upgrades. If there are some major version upgrades, again, some of those may take many tens of minutes, but that's not really an issue for your application, because your application is still pointed at the blue cluster. It's still working very happily against that blue cluster.

Once we've upgraded to the target version number, the process then adds in any necessary read replicas. Up until this point, it's just been a single instance, but our source here had three read replicas, so we add those in, and again we're applying the same configuration, security groups, tags, things like that. And then the last part of the process is to begin catching up, applying any additional transactions, any additional writes, that have taken place on the blue cluster in the intervening period. All of those will have been captured as change data capture records in that Neptune stream.

So all the process needs to do is take that last transaction ID that it made a note of a little earlier on, look for the point in the stream immediately following that transaction, and then begin to apply all of those changes. Even at this point, you can still be writing to the blue cluster, and those changes will make their way through by way of that Neptune stream. And at this point, you need to be watching the logs a lot more closely.

Every few seconds, there'll be a couple of records emitted into those logs, and the second one here is the most important one, the one that I've highlighted in green: the stream event ID difference. This represents the difference between the last transaction that's been applied on the green cluster and the transactions that are still pending, still sitting in the stream waiting to be applied. Ideally, over time this number is going to be coming down, it's going to be reducing. If it's not, or if it starts going up, that probably means you've got too much write traffic on the blue cluster and the green cluster is never going to catch up.

So in those kinds of situations, it may be that you have to somewhat prematurely pause or throttle writes on the blue cluster, so as to give the green a chance to catch up. But in many situations that number is just going to keep coming down, and you're really looking to anticipate a time when it's going to get very, very close to zero. That's the point in time where you're going to choose to flip from the blue cluster to the green cluster.

Now, whilst all this is taking place, you can be qualifying and testing and reviewing the green cluster. Don't perform any additional writes against it, only do that against the blue cluster, but other than that you can test that your application is working fine with it. And then at the very end of the process, at a point in time where you're happy with this, you pause writes on your blue cluster. The way that you do that is going to be very dependent upon your application: it may be that you can buffer writes in a stream or a queue, or you can use back pressure, or reconfigure your application to ensure that no writes make it all the way down to that blue database. But you need to be able to pause those writes.

So the last few transactions finally drain down and are applied to that green cluster, and at that point you're good to switch: that difference will have come all the way down to zero, so now you can switch over. You need to configure your application to use the new endpoints from the green cluster, and at that point both read and write traffic can be running against that green cluster and you've successfully migrated. At some point, you can then choose to delete that CloudFormation stack, and that's going to remove all of those ancillary assets, things like the DynamoDB table and the small EC2 instance that's been managing the process. And then later on, once you're happy that there's no need to roll back, you can finally remove that blue cluster. So you've successfully migrated from 1.1.0 up to 1.2.0.2.

And then the last thing that I want to talk about today is the way in which you can help graph practitioners derive better insights from your connected data. And by graph practitioners, I mean application developers, data engineers, data scientists and analysts. So I'm going to talk about some of the practitioner tooling that we've built over the last few years and how that's all come together. And then I've just got a little kind of future oriented bit talking about some of the opportunities that we're seeing emerge when you marry graphs with generative AI.

So over the last few years, the Neptune team have developed several pieces of open source software that they've contributed back to the graph community. A couple that I want to talk about are the Graph Notebook and the Graph Explorer. These are both Apache 2.0 open-sourced pieces of software, and they both work with multiple different graph databases: different Gremlin Server implementations, Blazegraph, Amazon Neptune obviously, Neo4j, so multiple different graph databases, and these tools work with all of them.

The Graph Notebook is a Python package that provides Jupyter notebooks with cell and line magics that make it really easy to interact with your database, to write queries, to visualize the results of those queries and to tune those queries, without any other boilerplate code. So you get to focus just on writing the query and visualizing the results. You can review the explain plans if you're running against Neptune, and you can use those to further tune the query. But there's no other application code surrounding it; you're not worrying about any other boilerplate code.

The Graph Explorer provides you with a browser-based, no-code visualization environment. It allows you to search and visualize both property graph data and RDF data. So the Graph Notebook is ideally suited for practitioners who like or need to write queries, but who don't want to write any other kind of boilerplate code; they just want to focus on writing those queries and potentially visualizing the results. The Graph Explorer is suited for users who don't want to write any queries at all. They just want to search for, filter and expand parts of their network via the visualization in the browser, and then export or save the results of their work to a file.

So they're both open source pieces of software that work with multiple different graph databases. The Neptune notebooks provide you with a fully managed IDE for Neptune that combines query authoring, application development and no-code visualization, all in one environment. The Neptune notebooks include the Graph Notebook and the Graph Explorer, but they also include the AWS SDK for Python, so that's the Boto3 libraries, the SDKs, the APIs that you can use to interact with any other AWS service. And they also include the AWS SDK for pandas, which is another piece of open source software that allows you to write complex ETL and machine learning tasks and run them against different AWS data and analytics services.

So you've got a whole package of libraries here whenever you provision a notebook environment. You can provision a notebook via a CloudFormation template or via the console, and the notebooks themselves are actually hosted by Amazon SageMaker, so you're effectively getting a fully managed, hosted Jupyter or JupyterLab environment. But besides that, you can also take advantage of all those other SageMaker features, so there are lots of other machine learning features that you can utilize by way of the Neptune notebooks.

When you create the notebook from the console, it's automatically configured for your cluster, and that includes things like IAM database authentication. So it's going to be very easy to connect to your cluster and start to interact with it. Using a Neptune notebook, for example, I can open the Graph Explorer for a quick and easy visualization of my data; I can begin to explore that data and save the results to a file once I've discovered the things that are of interest to me. Or I can author queries using the notebook magics, the Jupyter magics, and just focus on writing those queries, tuning those queries and visualizing the results.

So here I've written a couple of queries, one in Gremlin and one in openCypher, but against the same underlying data. And if you've got an RDF data set, you can do exactly the same with SPARQL.

So really at this point, I'm focused on query authoring. And then the last part of the puzzle was added in just the last few weeks. So this is the ability to be able to write application code that queries or interacts with Neptune.

So that AWS SDK for Python, Boto3, now includes the Neptune Data API, a new data SDK that we released just a few weeks ago that includes over 40 different data operations. These include things such as loading data, bulk loading data, and running different Neptune machine learning jobs or tasks. You can use it to get the engine details and the graph statistics, the summary statistics.

But most importantly, you can use the Data API now for querying Neptune. So today it supports openCypher and Gremlin, but we've got SPARQL support coming very, very soon. And the nice thing about this is, you know, it's just a standard AWS SDK. It really simplifies application development. It deals with all of the connection management. It deals with signing the requests, it deals with interacting with an IAM enabled database without you really having to do any special configuration.

So if you were using some of the open source drivers, open source libraries for interacting with Neptune, you may know that there was quite a lot of configuration, tuning, connection management and so on that you had to do there. All of that is taken care of in the Neptune Data API SDKs.

So how would you use these things together? What might be the typical workflow? Well, imagine you're tasked with writing an AWS Lambda function that queries Neptune. What I might do is begin by focusing first of all on authoring a query using those cell magics. So I just get to focus on the query: I can review the explain plan, I can visualize the results, and I can iterate on that query until I'm absolutely happy that I've got the right query for the task at hand.

At that point, within that notebook environment, I can then port it: I can develop some application code using the Data API and embed that query in the application code. But now I can introduce all that additional application logic, perhaps some preprocessing I want to do, or some work that I want to do once I've got the results back from Neptune. So I can flesh out all of that application logic, and then once I'm happy with that, it's a very easy matter to port that over to an AWS Lambda function handler.
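As a minimal sketch of that last step, here's a Lambda handler using the neptunedata client; the endpoint environment variable and the airports/routes data model are assumptions for illustration:

```python
import os
import boto3

# The Data API client signs requests and handles IAM auth for you.
client = boto3.client(
    "neptunedata",
    endpoint_url=os.environ["NEPTUNE_ENDPOINT"],  # e.g. https://<cluster-endpoint>:8182
)

def handler(event, context):
    # The query was authored and tuned in the notebook first, then embedded here.
    response = client.execute_open_cypher_query(
        openCypherQuery=(
            "MATCH (a:airport {code: 'AUS'})-[:route]->(d) "
            "RETURN count(d) AS outgoingRoutes"
        )
    )
    return response["results"]
```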

So that's some of the productivity tools. The very last topic for today is graphs and generative AI. Over the last few months, we've identified several different ways in which you can use generative AI to benefit your application development and derive better insights from your data, and there are probably three different high-level use cases.

The first is to use a large language model to write or author a graph query. You give it a natural language question and it gives you back a query in Gremlin or openCypher or SPARQL, whichever you specify. This can be useful during the application development process: I'm an application developer, I just want some starting point for authoring a query, so I could perhaps use an LLM to help me begin authoring that query. But I could also choose to incorporate some of this if I'm developing a chatbot or an assistant.

The second use case is using an LLM to design a graph application data model. Again, I provide a natural language description of my domain and the kinds of things that I want to ask of that domain, and I can guide the LLM to describe for me a good working graph data model. I can even ask it to create some sample data for me so that I can begin to test it, again perhaps in one of those notebook environments.

And in fact, my colleague Michael Hay has recently released a blog post on the AWS Database blog about using generative AI to create graph application data models.

And then the third use case is retrieval augmented generation. This is the ability to enrich the results, or the power, of a large language model by invoking an external source of data. So it may be that a large language model doesn't know how to describe all of the routes that connect Austin with London, all of the air routes that connect Austin with London. But if I can guide it by giving it access to a structured representation of all those air routes, a knowledge graph for example, then the large language model can help answer some detailed and interesting questions around that specific use case.

So how might all of this work? I've got a very simple example. We've got our source of data, our database, our graph, and that's Amazon Neptune, but we're also going to need a large language model.

In this use case, we're going to be using Amazon Bedrock, which is a managed service that gives you access to foundation models from companies such as Anthropic and Cohere. The model we're going to be using here is Anthropic's Claude v2 model, which is an AI assistant that's really good at text summarization, question and answer, generating content, retrieval augmented generation and even some programming tasks. So it's a good fit for some of the things that we want to do here.

So we've got Bedrock and Neptune. The last part of the architecture here is LangChain. This is an open source framework, an open source piece of software, that makes it really easy to build generative AI applications. Effectively, what LangChain is doing is brokering interactions between the large language model, in Bedrock in this instance, and the external source of data, Neptune.

So how would this work? Well, our user supplies a natural language question: how many outgoing routes are there from the airport in the city of Austin? So they submit a question, and LangChain takes that question and queries Neptune to get a representation of the graph schema. It just wants some simple representation of the kinds of nodes and edges and the properties that are contained within the graph.

And it's going to take that question and the schema and generate some additional prompts. It's going to do a little bit of prompt engineering and submit all of that to the large language model. So it's effectively going to ask the large language model: can you give me a query, in this case in openCypher, that satisfies this question? And here are some details of the graph that I want you to query against.

The large language model, Claude in this instance, is going to generate an openCypher query and return it to LangChain. LangChain then runs that query against Neptune, so you can see LangChain's brokering all these interactions between the large language model and Neptune. It runs that query against Neptune and gets the results.

And then it submits those results to Claude, to the LLM. Again, it's doing a little bit of prompt engineering, and the LLM is going to give us a natural language representation of those results, which we're going to return to the user.

So if you're using one of the latest Neptune notebooks, there's not a lot that you actually need to do to set this up. Most of the prerequisites are already in place. All you need to do is install LangChain, so it's just a matter of pip install langchain, and then you need to update the notebook's IAM role so that it can invoke that Claude v2 model in Bedrock. That's all the prerequisites that we need to set up.

And then the code itself is relatively simple. Here we're using LangChain to create what's called a NeptuneGraph, which is really just a connector to Neptune, so we're supplying the host and the port. Then we're creating a Bedrock client, and we're taking that NeptuneGraph and that Bedrock client and wiring them up together using this NeptuneOpenCypherQAChain. This gives us back a chain object, and this chain object is what we're then going to use to submit a question and get the results.
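As a minimal sketch of that wiring (the cluster endpoint is a placeholder, and the import paths reflect the LangChain releases current at the time of the talk; they may have moved in newer versions):

```python
import boto3
from langchain.llms.bedrock import Bedrock
from langchain.graphs import NeptuneGraph
from langchain.chains import NeptuneOpenCypherQAChain

# Connector to the Neptune cluster (placeholder host).
graph = NeptuneGraph(
    host="my-neptune-cluster.cluster-xxxx.us-east-1.neptune.amazonaws.com",
    port=8182,
)

# Claude v2 via Amazon Bedrock.
llm = Bedrock(
    model_id="anthropic.claude-v2",
    client=boto3.client("bedrock-runtime"),
)

# LangChain fetches the schema, prompts Claude to generate openCypher,
# runs it against Neptune, and summarizes the results.
chain = NeptuneOpenCypherQAChain.from_llm(llm=llm, graph=graph)

print(chain.run("How many outgoing routes does the Austin airport have?"))
```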

So this is the way in which we're going to interact with all of those different moving parts of the architecture. You can see that, using that chain object, I can supply a question: how many outgoing routes does the Austin airport have? And then you can see in the notebook the openCypher query that is being generated by Claude, and you can see the results of running that against Neptune.

So we get back some details from Neptune, and those results are handed back to Claude to give us a natural language response. And then finally, that response is: based on the information provided, the Austin airport has 93 outgoing routes. So it works really well for nice, simple questions.

With more complex questions, things begin to break down, and this is common across all of the models that you may be working with, so the work today is probably indicative of future possibilities. If you ask more complex questions, you may get a valid query, a valid openCypher query, but it may generate nonsensical results: it may run, but the results are not appropriate for answering the question.

And in some cases, we've seen examples where Claude, the large language model, tries to generate a query but injects functions or keywords that just don't exist, that aren't part of openCypher. But over time, we expect these things to improve enormously, so this is really just indicative of future possibilities.

There are ways of improving the kinds of responses that you get. Whilst LangChain is doing some of that prompt engineering, you can actually supply additional prompts that will further help guide the model to produce a meaningful query or a meaningful response. And this document from Anthropic is a very good introduction to how you can do your own prompt engineering to help improve the kind of results that you're getting from working with a large language model.

So if you're interested in this, that's just a very brief introduction to some of the work that we've been doing. We've got a workshop this afternoon in the MGM Grand, getting started with Neptune, LLMs and LangChain. I think it is fully booked up, but you can always queue and see whether there's a no-show or something like that.

And then throughout the rest of the week, we've got lots and lots of other Neptune sessions. There are sessions where you can get hands-on, or see other people getting really hands-on with code running against Neptune, and there are sessions where we're going to talk about some of the more recent features that are going into Neptune.

And then if you're interested in meeting some of the developers who are working on some of the open source tooling, I think they're going to be in the Expo Hall on Thursday giving some demos of some of that stuff, and that's a great chance to talk to them and ask them about some of that material.

And then finally, there are lots of new resources that we've put out this year that can help as you're beginning to architect your applications and as you're beginning to scale them out. We've produced some Well-Architected guidance for Neptune that was published recently in the documentation.

We have a very deep-dive data modeling course that's freely available through Skills Builder. I think that's a property graph data modeling course, and there's a lot of depth to it. And then we're frequently publishing lots of blogs on the AWS Database blog, and that top one there is the one from Michael Hay around using generative AI to build a data model for Amazon Neptune.

So that's it. Hopefully that was useful, and hopefully there's something you can take away in terms of scale, availability, or the way in which you might use those productivity tools. Thank you very much for coming along today, and I hope you have a really good rest of the week. Thank you.
