Cron job in a Docker container cannot connect to another container



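I want to use a cron job to run a script that fetches data from a news API and feeds it into PostgreSQL, which runs in another container.

So the simplified architecture is:

app (in container) -> postgres (in container)

The cron job script lives inside the app container; it fetches the data and then sends it to postgres.

My crontab contains: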

* * * * * cd /tourMamaRoot/tourMama/cronjob && fetch_news.py >> /var/log/cron.log 2>&1

The script runs successfully when I run it manually, but when I put it in the crontab it fails with this error:

File "/usr/local/lib/python3.6/dist-packages/django/db/backends/base/base.py", line 195, in connect
self.connection = self.get_new_connection(conn_params)
File "/usr/local/lib/python3.6/dist-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
connection = Database.connect(**conn_params)
File "/usr/local/lib/python3.6/dist-packages/psycopg2/__init__.py", line 126, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

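With crontab, it seems to look for the database only locally. How can I set it up so that it writes the data to the other container, just as it does when I run the script manually?

Info:

My app's Docker container is based on Ubuntu 18.04. Here is the Dockerfile for my app: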

FROM ubuntu:18.04
MAINTAINER Eson
ENV PYTHONUNBUFFERED 1
ENV DEBIAN_FRONTEND=noninteractive
EXPOSE 8000
# Setup directory structure
RUN mkdir /tourMamaRoot
WORKDIR /tourMamaRoot/tourMama/
COPY tourMama/requirements/base.txt /tourMamaRoot/base.txt
COPY tourMama/requirements/dev.txt /tourMamaRoot/requirements.txt
# install Python 3
RUN apt-get update && apt-get install -y \
    software-properties-common
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update && apt-get install -y \
    python3.7 \
    python3-pip
RUN python3.7 -m pip install pip
RUN apt-get update && apt-get install -y \
    python3-distutils \
    python3-setuptools
# install Postgresql
RUN apt-get -y install wget ca-certificates
RUN wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
RUN sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" >> /etc/apt/sources.list.d/pgdg.list'
RUN apt-get update
RUN apt-get install -y postgresql postgresql-contrib
# Install some dep
RUN apt-get install net-tools
RUN apt-get install -y libpq-dev python-dev
RUN pip3 install -r /tourMamaRoot/requirements.txt
# Copy application
COPY ./tourMama/ /tourMamaRoot/tourMama/

docker-compose file:

version: '3'
services:
  app:
    build:
      # current directory
      # if for dev, need to have Dockerfile.dev in folder
      dockerfile: docker/dev/Dockerfile
      context: .
    ports:
      #host to image
      - "8000:8000"
    volumes:
      # map directory to image, which means if something changed in
      # current directory, it will automatically reflect on image,
      # don't need to restart docker to get the changes into effect
      - ./tourMama:/tourMamaRoot/tourMama
    command: >
      sh -c "python3 manage.py wait_for_db &&
      python3 manage.py makemigrations &&
      python3 manage.py migrate &&
      python3 manage.py runserver 0.0.0.0:8000 &&
      sh initial_all.sh"
    environment:
      - DB_HOST=db
      - DB_NAME=app
      - DB_USER=postgres
      - DB_PASS=supersecretpassword
    depends_on:
      - db
      - redis
  db:
    image: postgres:11-alpine
    ports:
      #host to image
      - "5432:5432"
    environment:
      - POSTGRES_DB=app
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=supersecretpassword
  redis:
    image: redis:5.0.5-alpine
    ports:
      #host to image
      - "6379:6379"
#    command: ["redis-server", "--appendonly", "yes"]
#    hostname: redis
#    networks:
#      - redis-net
#    volumes:
#      - redis-data:/data

My cron job script is:

import os
import sys
import django
from django.db import IntegrityError
from newsapi.newsapi_client import NewsApiClient
sys.path.append("../")
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "tourMama.settings")
django.setup()
from news.models import News
from tourMama_app.models import Category
from config.script import categorization_loader
load_category = categorization_loader.load_category_data("catagorization.yml")
categories = list(load_category.keys())
countries = ["us", "gb"]
# Init
newsapi = NewsApiClient(api_key='secret')
for category in categories:
    for country in countries:
        category_lower = category.lower()
        category_obj = Category.objects.filter(
            category=category,
        ).get()
        top_headlines = newsapi.get_top_headlines(
            q='',
            # sources=object'bbc-news,the-verge',
            category=category_lower,
            language='en',
            page_size=100,
            country=country
        )
        for article in top_headlines.get("articles"):
            try:
                News.objects.create(
                    source=article["source"].get("name") if article["source"] else None,
                    title=article.get("title"),
                    author=article.get("author"),
                    description=article.get("description"),
                    url=article.get("url"),
                    urlToImage=article.get("urlToImage"),
                    published_at=article.get("publishedAt"),
                    content=article.get("content"),
                    category=category_obj
                )
            except IntegrityError:
                print("data already exist")
            else:
                print("data insert successfully")

In case it is needed, my Django settings file is:

"""
Django settings for tourMama project.
Generated by 'django-admin startproject' using Django 2.2.1.
For more information on this file, see
https://docs.djangoproject.com/en/2.2/topics/settings/
For the full list of settings and their values, see
https://docs.djangoproject.com/en/2.2/ref/settings/
"""
import os
# Build paths inside the project like this: os.path.join(BASE_DIR, ...)
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
TEMPLATE_DIR = os.path.join(BASE_DIR,"templates")
# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/2.2/howto/deployment/checklist/
# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = 'd084cm20*x*&s&w)vq+7*teea540yny+fyi^dh57nxiff&a#25'
# SECURITY WARNING: don't run with debug turned on in production!
DEBUG = True
COMPRESS_ENABLED = False
COMPRESS_CSS_HASHING_METHOD = 'content'
COMPRESS_FILTERS = {
    'css': [
        'compressor.filters.css_default.CssAbsoluteFilter',
        'compressor.filters.cssmin.rCSSMinFilter',
    ],
    'js': [
        'compressor.filters.jsmin.JSMinFilter',
    ]
}
HTML_MINIFY = False
KEEP_COMMENTS_ON_MINIFYING = False
ALLOWED_HOSTS = ['0.0.0.0', "127.0.0.1"]

# Application definition
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'channels',
    'bootstrap3',
    'tourMama_app',
    'account',
    'posts',
    'group',
    'news',
    'statistics',
    'compressor',
]
AUTH_USER_MODEL = "account.UserProfile"
MIDDLEWARE = [
    'django.middleware.gzip.GZipMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    'htmlmin.middleware.MarkRequestMiddleware',
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]
ROOT_URLCONF = 'tourMama.urls'
TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [TEMPLATE_DIR,],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]
WSGI_APPLICATION = 'tourMama.wsgi.application'
ASGI_APPLICATION = 'tourMama.routing.application'
# https://stackoverflow.com/questions/56480472/cannot-connect-to-redis-container-from-app-container/56480746#56480746
CHANNEL_LAYERS = {
    'default': {
        'BACKEND': 'channels_redis.core.RedisChannelLayer',
        'CONFIG': {
            "hosts": [('redis', 6379)],
        },
    },
}

# Database
# https://docs.djangoproject.com/en/2.2/ref/settings/#databases
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'HOST': os.environ.get('DB_HOST'),
        'NAME': os.environ.get('DB_NAME'),
        'USER': os.environ.get('DB_USER'),
        'PASSWORD': os.environ.get('DB_PASS')
    }
}

# Password validation
# https://docs.djangoproject.com/en/2.2/ref/settings/#auth-password-validators
AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]
STATICFILES_FINDERS = (
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder',
    # other finders..
    'compressor.finders.CompressorFinder',
)
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}
# Internationalization
# https://docs.djangoproject.com/en/2.2/topics/i18n/
LANGUAGE_CODE = 'en-us'
TIME_ZONE = 'UTC'
USE_I18N = True
USE_L10N = True
USE_TZ = True

# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/2.2/howto/static-files/

STATIC_URL = '/static/'
STATICFILES_DIRS = [os.path.join(BASE_DIR, 'static'),]
STATIC_ROOT = os.path.join(BASE_DIR,"static_root")
MEDIA_URL = '/media/'
MEDIA_ROOT = os.path.join(BASE_DIR, 'media')
LOGIN_REDIRECT_URL = "home:index"
LOGOUT_REDIRECT_URL = "home:index"

environment:
  - DB_HOST=db
  - DB_NAME=app
  - DB_USER=postgres
  - DB_PASS=supersecretpassword

I see that you pass the environment variables through docker compose, as shown above. That works when the container runs the command directly in its shell.

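However, when you put the command in the crontab, cron runs it in a separate, new shell that does not receive any of that environment. With DB_HOST unset, os.environ.get('DB_HOST') in your settings returns None, so the PostgreSQL client falls back to the local Unix domain socket, which is exactly the error in your traceback.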
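
The effect can be reproduced without cron. The following is only a minimal sketch using the Python standard library: it starts a child Python process twice, once with DB_HOST present (roughly what the compose `command:` sees) and once with DB_HOST absent (the relevant difference in a cron job's environment). The helper name `read_db_host` is hypothetical.

```python
import os
import subprocess
import sys

# Child program: print the value Django's settings would read for HOST.
CHILD = "import os; print(repr(os.environ.get('DB_HOST')))"


def read_db_host(env):
    """Run a child Python process with the given environment; return what it prints."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


# Roughly what the compose `command:` sees: the inherited environment plus DB_HOST.
compose_like_env = dict(os.environ, DB_HOST="db")

# Stand-in for a cron job: cron starts jobs with its own small default
# environment; the part that matters here is that DB_HOST is not in it.
cron_like_env = {k: v for k, v in os.environ.items() if k != "DB_HOST"}

print("compose-like:", read_db_host(compose_like_env))  # 'db'
print("cron-like:   ", read_db_host(cron_like_env))     # None
```

The second call prints None, which is the value your settings file ends up passing as HOST when the script is started by cron.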

To fix this, you can create a separate shell script:

cat <<EOF > /temp/script.sh
#!/bin/bash
export DB_HOST=db
export DB_NAME=app
export DB_USER=postgres
export DB_PASS=supersecretpassword
cd /tourMamaRoot/tourMama/cronjob && fetch_news.py >> /var/log/cron.log 2>&1
EOF
chmod +x /temp/script.sh

And edit your crontab like this:

* * * * * /temp/script.sh
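
Hard-coding the values in the wrapper duplicates what is already in your docker-compose file. As an alternative, the wrapper text could be generated from the environment of the container's main process before cron starts. This is only a sketch under that assumption: the function name `render_wrapper` and the idea of generating the file at start-up are illustrative and not something your setup already has; the variable names and the job command are the ones used above.

```python
import shlex

# Variables the Django settings read via os.environ.get(...).
DB_KEYS = ["DB_HOST", "DB_NAME", "DB_USER", "DB_PASS"]

# The same command the wrapper script above runs.
JOB = "cd /tourMamaRoot/tourMama/cronjob && fetch_news.py >> /var/log/cron.log 2>&1"


def render_wrapper(env, keys=DB_KEYS, job=JOB):
    """Return the text of a bash wrapper that exports `keys` from `env`, then runs `job`."""
    missing = [key for key in keys if key not in env]
    if missing:
        # Fail early instead of writing a wrapper that silently lacks a variable.
        raise KeyError("missing environment variables: " + ", ".join(missing))
    lines = ["#!/bin/bash"]
    for key in keys:
        # shlex.quote keeps values with spaces or shell metacharacters intact.
        lines.append("export {}={}".format(key, shlex.quote(env[key])))
    lines.append(job)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values copied from the docker-compose file in the question.
    example_env = {
        "DB_HOST": "db",
        "DB_NAME": "app",
        "DB_USER": "postgres",
        "DB_PASS": "supersecretpassword",
    }
    print(render_wrapper(example_env), end="")
```

Inside the container you would pass `os.environ` instead of the example dictionary, write the result to /temp/script.sh, and make it executable as before.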
