Using Automation Scripts with Python
Ever dreamed of having your own bot? A little helper that spreads your wisdom and helps you grow your audience? Today we take the first step and learn how to write, register, and deploy a Python-based Twitter bot using Tweepy, completely free on an AWS micro instance.
You can check out the bot we are going to build here:
https://twitter.com/PythonCodeHub
Our bot
Overview
Depending on your knowledge, you may be able to skip one or two sections. To get a Twitter bot running permanently, we have to complete the following steps:
- Write the bot
- Register a Twitter account and get developer access
- Deploy our creation to AWS (Amazon Web Services) for free
We will start with the first step, so that you understand the basics and can program your bot before getting into the details of making it permanently available. If you know how to handle AWS, you should be set in under 20 minutes!
Write Your Bot
We will be using the excellent Tweepy library for Python, which makes working with Twitter as easy as clapping for this story. The entire code is at the bottom; let's go through it step by step.
#!/usr/bin/env python3
API_KEY = "YOUR_API_KEY"
API_SECRET = "YOUR_API_SECRET"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"
ACCESS_TOKEN_SECRET = "YOUR_ACCESS_TOKEN_SECRET"
This should be at the very top of your file. We need four keys, which we will get from Twitter, to connect our bot to their network. Storing access keys in a plain file like this is generally very bad practice (we will do it anyway ;). Once you have understood the basics and have your first bot running, I strongly encourage you to find a better way to store such keys, or to encrypt the file.
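One possible "better way", sketched here as an illustration rather than as part of the original bot, is to read the keys from environment variables so they never appear in the source file. The variable names and the load_keys helper below are hypothetical.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
KEY_NAMES = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_keys(env=os.environ):
    """Return the four keys from the environment, failing early if one is missing."""
    missing = [name for name in KEY_NAMES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in KEY_NAMES)
```

With the variables exported in your shell, the four constants could then be filled with API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET = load_keys().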
We will use the libraries Tweepy and Schedule. Tweepy handles everything you could want from the Twitter API, such as tweeting, direct messaging, and searching for other tweets. Schedule lets us run the bot at a regular interval. Finally, we will use the datetime library to handle the date-related issues we will encounter.
For this to work, let's install them with pip:
pip install tweepy schedule
Then we import them in our file, which we will call py_bot.py:
import tweepy
import schedule
from datetime import datetime, timedelta, timezone
The Code
Our bot will live in a magnificent class called PythonBot. We will start with the run_bot method, which shows what we are trying to achieve.
This is the heart of our bot. This method does all the work, and we run it every 10 minutes. We first define a list of search terms, then search the Twitter API for tweets that contain these terms. Once we have our list of tweets, we select the most popular one and retweet it. Simple? Let's see how it works.
We first loop over our search terms. For each term we query the Twitter API by calling "self.api.search()" (renamed search_tweets() in Tweepy 4); the Tweepy documentation describes everything the call can do. This returns a list of tweets, e.g.
[
{'_json': {'text': "Hello World I am a Tweet", 'created_at': '10.10.2020 15:12'}, 'location': 'Germany'},
{'_json': {'text': "Hello World I am a Tweet", 'created_at': '10.10.2020 15:13'}, 'location': 'USA'}
] (strongly simplified version)
We then concatenate these tweets into one list and work from there. After this we should have around 5k tweets (at line 6).
From here, all we have to do is filter them. First we get rid of all the unneeded information inside a tweet, such as the location. We use a little trick called a list comprehension, which builds a new list from an existing one in a single expression. We are now left with (at line 9):
[
{'text': "Hello World I am a Tweet", 'created_at': '10.10.2020 15:12'},
{'text': "Hello World I am a Tweet", 'created_at': '10.10.2020 15:13'}
]
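If list comprehensions are new to you, the following minimal sketch shows the idea on made-up data. It assumes, as in the simplified example above, that each search result carries its raw data in a _json attribute; the SimpleNamespace objects are only stand-ins for real search results.

```python
from types import SimpleNamespace

# Stand-ins for the objects returned by the search call; the values are
# made up for illustration and only the `_json` attribute matters here.
found_tweets = [
    SimpleNamespace(
        _json={"text": "Hello World I am a Tweet", "created_at": "10.10.2020 15:12"},
        location="Germany",
    ),
    SimpleNamespace(
        _json={"text": "Hello World I am a Tweet", "created_at": "10.10.2020 15:13"},
        location="USA",
    ),
]

# The list comprehension: a new list holding only each tweet's `_json` dict.
tweets = [tweet._json for tweet in found_tweets]

# The equivalent, longer for loop.
tweets_loop = []
for tweet in found_tweets:
    tweets_loop.append(tweet._json)
```

Both variants produce the same list; the comprehension just says it in one line.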
Now we want to make sure that we only tweet new things. Namely, we check whether the tweet's creation date is later than our last tweeting time. For this we use the datetime library. We convert the time from "10.10.2020 15:12" to Time(day=10,month=10,year=2020,hour=15,min=12), which allows us to compare times using the > operator. It sounds complicated, but we just check whether the tweet is newer than our last tweet time. We will define the exact function self.str_to_time in a bit.
At line 14 we make sure we are left only with original tweets and no retweets, since we want to publish new content.
All we have to do now is select the most popular tweet. We do this in line 15 with the built-in Python function max. We pass it the function self.selection_function, and it returns the tweet with the highest score (as defined by self.selection_function). We will define this function in a bit. Given a tweet, it sums up its likes and retweets to obtain a score. We then select the tweet with the most Retweets+Likes.
Once we have found our tweet, we retweet it and spread the word faster than COVID-19!
We put all of this inside a try/except block, since this is the internet and things often go wrong.
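The steps described above can be put together roughly as follows. This is a sketch reconstructed from the description, not the author's original listing, so the line numbers mentioned in the text do not apply to it. It is written as a plain function taking the API object as a parameter so it can be tried without credentials. Several things are assumptions: the search(q=..., count=...) and retweet(id) calls follow Tweepy 3.x (in Tweepy 4 the search method is called search_tweets), the field names follow Twitter's v1.1 tweet JSON, and retweets are assumed to carry a "retweeted_status" field.

```python
from datetime import datetime, timezone

TIME_PATTERN = "%a %b %d %H:%M:%S %z %Y"  # layout of Twitter's "created_at"


def run_bot(api, search_terms, last_tweet_time):
    """Retweet the most popular new original tweet; return the new last-tweet time.

    `api` is any object offering search(q=..., count=...) and retweet(id),
    which is how this sketch assumes the Tweepy 3.x API object is used.
    """
    try:
        # Collect the results for every search term into one list.
        found = []
        for term in search_terms:
            found += api.search(q=term, count=100)
        # Keep only the raw JSON dict of each tweet.
        tweets = [t._json for t in found]
        # Keep tweets created after our last run.
        tweets = [
            t for t in tweets
            if datetime.strptime(t["created_at"], TIME_PATTERN) > last_tweet_time
        ]
        # Drop retweets (assumed to carry a "retweeted_status" field).
        tweets = [t for t in tweets if "retweeted_status" not in t]
        # Pick the tweet with the most retweets plus likes and retweet it.
        best = max(tweets, key=lambda t: t["retweet_count"] + t["favorite_count"])
        api.retweet(best["id"])
        return datetime.now(timezone.utc)
    except Exception as error:  # network errors, an empty tweet list, ...
        print("run_bot failed:", error)
        return last_tweet_time
```

Returning the old time on failure means the next run still considers the tweets that were missed this time.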
The __init__()
Now we are almost through the code! All that is left is to set up the functions and some basic time handling. We do this inside our __init__().
The first 6 lines connect us to the Twitter API so that we can call it as we did above. Next we define the two functions we encountered earlier. We will use so-called lambda functions, which are simply functions that can be defined with less code.
self.str_to_time(x, 'pattern'): this function converts our date to a datetime object, as discussed above. A time received from Twitter looks like
Thu Aug 13 21:53:32 +0000 2020
Using the datetime library, we can convert this to a datetime object, so that we can make simple > comparisons between them. It really is not that important, so don't worry too much about it if you haven't worked with time in Python yet.
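As an illustration of this conversion, here is a minimal sketch. The exact body of the original lambda is not shown in the text, so the pattern string below is an assumption derived from the sample timestamp above, and last_tweet_time is a made-up value.

```python
from datetime import datetime, timezone

# Pattern matching the layout "Thu Aug 13 21:53:32 +0000 2020".
TWITTER_TIME_PATTERN = "%a %b %d %H:%M:%S %z %Y"

# A lambda wrapping datetime.strptime, in the spirit of self.str_to_time.
str_to_time = lambda x, pattern=TWITTER_TIME_PATTERN: datetime.strptime(x, pattern)

created = str_to_time("Thu Aug 13 21:53:32 +0000 2020")
last_tweet_time = datetime(2020, 8, 13, 21, 45, tzinfo=timezone.utc)

# Both values are timezone-aware, so ">" compares them directly.
is_new = created > last_tweet_time
print(created.isoformat(), is_new)
```

Because the pattern includes %z, the parsed value carries its UTC offset, which is what makes the comparison with a timezone-aware "last tweet time" valid.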
self.selection_function(tweet): this sums up the retweet count and the like count and outputs a number, e.g. 10 Retweets + 5 Likes = 15.
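A small sketch of such a scoring function and of how max uses it. The field names retweet_count and favorite_count follow Twitter's v1.1 tweet JSON (favorite_count being the like count); treat them as an assumption if your API version differs. The tweets are made up.

```python
# Score a tweet by retweets plus likes.
selection_function = lambda tweet: tweet["retweet_count"] + tweet["favorite_count"]

tweets = [
    {"id": 1, "retweet_count": 10, "favorite_count": 5},
    {"id": 2, "retweet_count": 3, "favorite_count": 20},
    {"id": 3, "retweet_count": 0, "favorite_count": 1},
]

# max() calls selection_function on every tweet and returns the tweet
# with the highest score.
most_popular = max(tweets, key=selection_function)
```

Note that max raises a ValueError when the list is empty, which is one more reason to wrap the whole run in a try/except block.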
We have only one little issue left: when we start the bot, we don't have a last running time. We can solve this by defining it as now - 10 minutes, which corresponds to the expression:
self.last_tweet_time = datetime.now(timezone.utc) - timedelta(minutes=10)
This results in whatever the time was 10 minutes ago.
Now all we have to do is schedule the function self.run_bot() to run every 10 minutes, and we are good to go. We do this using the schedule library. To run our bot, we additionally define a main entry point.
import time

if __name__ == '__main__':
    bot = PythonBot()
    while True:
        schedule.run_pending()
        time.sleep(1)  # pause between checks instead of spinning the CPU
Here we create our bot and then run all scheduled functions forever. THAT'S IT!
You can find the entire code at the bottom for easy copy and paste. The next sections cover how to obtain the API keys and how to deploy the bot to AWS (not necessary, but fancy).
Get Your Twitter Keys
To communicate with Twitter, you need to obtain so-called API keys. These are like passwords and let Twitter know that the bot is tweeting under your account.
If you want your bot to have its own identity on Twitter, make sure to create a new account and ADD AN EMAIL AND A PHONE NUMBER. You can also use your own account.
Apply for a developer account at https://developer.twitter.com/en/apply-for-access. Choose "Making a bot" and follow the instructions.
Once this is completed, you get two keys, the "API-KEY" and the "API-SECRET"; store them for later. Now we create an app at https://developer.twitter.com/en/portal/projects-and-apps
One more thing: we need to change the app's access setting to "Read and Write". You can do so in your app settings.
After this you can get your other two keys, the "Access Token" and the "Access Secret" (but make sure you set the Read and Write permission first). After clicking the key, hit the "generate access token" button and they should show up.
We now have our 4 keys.
You can now put those 4 keys into the placeholders in the file and run the bot. Well done, you now have a working bot that tweets the most interesting news from the Python world every 10 minutes!
Deploy It to AWS
Since running this bot 24/7 on your own computer might not always be ideal, we will deploy it to the cloud on AWS. Amazon offers so-called micro instances, which are free and powerful enough to host several of these bots.
If you have never worked with AWS before, it might be advisable to watch a short video that gets you started with the basics. It is not complicated, but it has a lot of buttons.
Step-by-Step
1. Start an AWS micro instance in your favorite region
2. Connect to the instance using ssh
3. Install Python:
sudo yum install python3 -y
4. Install the dependencies:
sudo pip3 install tweepy schedule
5. Put the code into a file called 'py_bot.py', either by using git or simply by copying and pasting.
6. Make the file executable:
chmod +x py_bot.py
7. Run the bot using:
nohup python3
This command runs your bot so that it keeps running after you log out. You can monitor what your bot does by looking at the 'py_bot_log.txt' file.
Congratulations, you just deployed your first Twitter bot! I am very proud of you; this was not easy. Make sure to stay tuned, and if you run into any issues, let me know. It is beautiful sharing knowledge with you.
Full Code
Translated from: https://medium.com/python-in-plain-english/automate-your-social-media-presence-with-python-26e95137fa04