Practical PySpark Statements

code1
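# Columns to select, as a comma-separated string
feature1 = "id, application_id, user_profile_id, amount"
# Fill the column list and the row limit into the query template
sql1 = """SELECT %s FROM tb_source_data.loan_applications LIMIT %d"""%(feature1, 3)
# Print up to 1000 rows without truncating long cell values
hiveContext.sql(sql1).show(1000, truncate=False)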

This is equivalent to the single statement:

hiveContext.sql("""SELECT %s FROM tb_source_data.loan_applications LIMIT %d"""%("id, application_id, user_profile_id, amount", 3)).show(1000, truncate=False)

Output:

+-----+----------------------------+---------------+--------+
|id   |application_id              |user_profile_id|amount  |
+-----+----------------------------+---------------+--------+
|18132|AAAA17071813423573529711111 |17322          |0.0     |
|18133|BBBBB17071813472976219211111|17323          |100000.0|
|18134|CCCC17071813490193476111111 |17324          |0.0     |
+-----+----------------------------+---------------+--------+
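The query string in code1 is assembled with % formatting. As a minimal sketch of the same idea, the helper below builds the statement from a list of column names instead of a hand-written string. The function name build_select is hypothetical and not part of PySpark; the table and column names are the ones used above. It only builds the string, so it runs without a Spark session.

```python
def build_select(table, columns, limit):
    """Hypothetical helper: build a SELECT ... LIMIT statement from a column list."""
    if not columns:
        raise ValueError("at least one column is required")
    # Join the column names with ", " and format the limit as an integer
    return "SELECT {} FROM {} LIMIT {:d}".format(", ".join(columns), table, limit)


sql1 = build_select(
    "tb_source_data.loan_applications",
    ["id", "application_id", "user_profile_id", "amount"],
    3,
)
print(sql1)
```

With a live session, the resulting string can be passed to hiveContext.sql(sql1).show(1000, truncate=False) exactly as in code1.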
code2

List the table's column names (the first field of each row returned by DESC)

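# Each row of the DESC result starts with the column name, so i[0] is the name
for i in hiveContext.sql("DESC tb_source_data.loan_applications").collect():
    print(i[0] + ",")  # print() with parentheses works in both Python 2 and Python 3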
[output]:
id,
application_id,
user_profile_id,
amount,
tenor,
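The printed names can be pasted into a SELECT by hand, but they can also be joined in code. The sketch below assumes the rows come from hiveContext.sql("DESC ...").collect(); here plain one-element tuples stand in for those Row objects so the example runs without Spark. The helper name column_names is hypothetical, and skipping empty names and names starting with "#" is an assumption meant to cover the extra header rows that DESC can return for partitioned tables.

```python
def column_names(desc_rows):
    """Hypothetical helper: extract column names from rows of a DESC result."""
    names = []
    for row in desc_rows:
        name = row[0]  # first field of each DESC row is the column name
        # Assumption: skip blank rows and "# ..." header rows
        if name and not name.startswith("#"):
            names.append(name)
    return names


# Stand-in rows using the five names shown in the output above
desc_rows = [("id",), ("application_id",), ("user_profile_id",), ("amount",), ("tenor",)]
feature_all = ", ".join(column_names(desc_rows))
print(feature_all)
```

The joined string has the same shape as feature1 in code1, so it can be dropped into the same query template.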
