How to connect a Jupyter/IPython notebook to Amazon Redshift



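I'm on Mac OS X Yosemite. I installed the packages postgresql, psycopg2 and simplejson using conda install "package name". After installation, I imported these packages. I then tried to create a JSON file with my Amazon Redshift credentials: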

{
    "user_name": "YOUR USER NAME",
    "password": "YOUR PASSWORD",
    "host_name": "YOUR HOST NAME",
    "port_num": "5439",
    "db_name": "YOUR DATABASE NAME"
}

I used

with open("Credentials.json") as fh:
    creds = simplejson.loads(fh.read())

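But this throws an error. These are the instructions given on a website. I tried searching other websites, but none of them gives a good explanation.

Please let me know how I can connect Jupyter to Amazon Redshift.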

Here's a great guide from RJMetrics: "Setting up Your Analytics Stack with Jupyter Notebook & AWS Redshift". It uses ipython-sql.

This works great and displays the results in a grid.

In [1]:

import sqlalchemy
import psycopg2
import simplejson
%load_ext sql
%config SqlMagic.displaylimit = 10

In [2]:

with open("./my_db.creds") as fh:
    creds = simplejson.loads(fh.read())
# Parentheses let the expression span several lines
connect_to_db = (
    'postgresql+psycopg2://'
    + creds['user_name'] + ':' + creds['password'] + '@'
    + creds['host_name'] + ':' + creds['port_num'] + '/' + creds['db_name']
)
%sql $connect_to_db

In [3]:

%sql SELECT * FROM my_table LIMIT 25;
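
One caveat with the string concatenation above: if the password contains characters such as @, : or /, the resulting URL will not parse correctly. Below is a minimal sketch of a helper that percent-encodes the user name and password before building the URL. The names build_connection_url and REQUIRED_KEYS are my own (hypothetical), the key names are the ones from the credentials file in the question, and the values in the example are made up.

```python
import json
from urllib.parse import quote

# Keys expected in the credentials file (same names as in the question).
REQUIRED_KEYS = ("user_name", "password", "host_name", "port_num", "db_name")


def build_connection_url(creds):
    """Build a postgresql+psycopg2 URL from a credentials dict.

    The user name and password are percent-encoded so that characters
    like '@', ':' or '/' do not break the URL.
    """
    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise KeyError("missing credential keys: " + ", ".join(missing))
    return "postgresql+psycopg2://{user}:{password}@{host}:{port}/{db}".format(
        user=quote(creds["user_name"], safe=""),
        password=quote(creds["password"], safe=""),
        host=creds["host_name"],
        port=int(creds["port_num"]),  # also catches a non-numeric port early
        db=creds["db_name"],
    )


def load_connection_url(path):
    """Read a JSON credentials file and return the connection URL."""
    with open(path) as fh:
        return build_connection_url(json.load(fh))


# Made-up values, for illustration only.
example_creds = {
    "user_name": "analyst",
    "password": "p@ss:w/rd",
    "host_name": "example.redshift.amazonaws.com",
    "port_num": "5439",
    "db_name": "dev",
}
example_url = build_connection_url(example_creds)
print(example_url)
```

In the notebook you could then do connect_to_db = load_connection_url("./my_db.creds") and pass it to %sql $connect_to_db as before. The standard-library json module is used here, but simplejson.load would work the same way.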

Here's how I did it:

# ---- INSERT IN CELL 1 ----
import psycopg2
redshift_endpoint = "<add your endpoint>"
redshift_user = "<add your user>"
redshift_pass = "<add your password>"
port = 5439  # <your port>; 5439 is the Redshift default
dbname = "<your db name>"

# ---- INSERT IN CELL 2 ----
from sqlalchemy import create_engine
from sqlalchemy import text
engine_string = "postgresql+psycopg2://%s:%s@%s:%d/%s" % (
    redshift_user, redshift_pass, redshift_endpoint, port, dbname)
engine = create_engine(engine_string)

# ---- INSERT IN CELL 3 - THIS EXAMPLE WILL GET ALL TABLES FROM YOUR DATABASE ----
sql = """
select schemaname, tablename from pg_tables order by schemaname, tablename;
"""

# ---- LOAD RESULTS AS TUPLES TO A LIST ----
# engine.execute() was removed in SQLAlchemy 2.0, so run the query on a connection
tables = []
with engine.connect() as conn:
    output = conn.execute(text(sql))
    for row in output:
        tables.append(row)
tables

# ---- IF YOU'RE USING PANDAS ----
import pandas as pd
raw_data = pd.read_sql_query(text(sql), engine)
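
If you would rather skip SQLAlchemy altogether, psycopg2 connections follow the Python DB-API, so a small helper along these lines should also work. This is only a sketch: fetch_rows is a name I made up, and the demo uses an in-memory sqlite3 database as a stand-in so it runs without a cluster. Against Redshift you would pass a connection from psycopg2.connect(host=..., port=..., dbname=..., user=..., password=...) instead.

```python
import sqlite3


def fetch_rows(conn, sql):
    """Run a query on a DB-API 2.0 connection and return all rows as a list of tuples."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        # Always release the cursor, even if the query fails.
        cur.close()


# Stand-in demo: sqlite3 instead of Redshift, with a made-up table.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT)")
demo_conn.executemany("INSERT INTO my_table VALUES (?, ?)", [(1, "a"), (2, "b")])

rows = fetch_rows(demo_conn, "SELECT id, name FROM my_table ORDER BY id")
print(rows)
demo_conn.close()
```

The helper only uses cursor(), execute() and fetchall(), which psycopg2 also provides, so the same function can be reused with the pg_tables query above.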

The easiest way is to use this extension: https://github.com/sat28/jupyter-redshift

The sample notebook shows how it loads the Redshift utility as an IPython magic.

Edit 1

Support for writing back to the Redshift database has also been added.

Latest update