19 Oct 2017

Planet Python

PyCharm: PyCharm 2017.3 EAP 6: Improved interpreter selection interface

The latest and greatest early access program (EAP) version of PyCharm is now available from our website:

Get PyCharm 2017.3 EAP 6

Improved Python interpreter selection interface

In the previous PyCharm 2017.3 EAP 5 build, we introduced a new UI for configuring project interpreters in your existing projects.

This build adds a new UI for configuring project interpreters during project creation.
Go to File | New Project…

The New Project dialog gives you an experience similar to the one described above for existing projects. If you'd like to reuse an existing virtualenv or system interpreter, you can choose it under 'Existing interpreter'. And if you create a new virtualenv and would like to reuse it for future projects, you can check 'Make available to all projects' and it will appear in the dropdown on the project interpreter page for all other projects.

Install PyCharm with snap packages

Starting from this EAP build, we offer an alternative installation method for Ubuntu users: snap packages. Snaps are quick to install, safe to run, and easy to manage, and they are updated automatically in the background every day, so you always have fresh builds as soon as they're out. If you are running Ubuntu 16.04 LTS or later, snap is already preinstalled, so you can use snaps from the command line right away.

Installing PyCharm Professional or Community Edition 2017.3 EAP is now as easy as one command (make sure you use just one of the options from the square brackets):

$ sudo snap install [pycharm-professional | pycharm-community] --classic --edge

This command will install PyCharm Professional or Community Edition from the "edge" channel, where we store EAP builds. Please note that the snap installation method is experimental; currently, we officially distribute only PyCharm 2017.3 EAP in the edge channel.

Depending on which snap you've installed, you can run your fresh PyCharm 2017.3 EAP with:

$ [pycharm-professional | pycharm-community]

Now you can use other snap commands for managing your snaps.

Snap supports auto-updates from the stable channel only. Since we distribute 2017.3 EAP builds in the edge channel, you'll need to run the update manually to get the next EAP build:

$ snap refresh [pycharm-professional | pycharm-community] --edge

Read more about how snaps work, and let us know about your experience using snap to install and update PyCharm 2017.3 EAP so that we can consider snaps for our official stable releases. You can give us your feedback on Twitter or in the comments on this blog post.

Other improvements in this build:

If these features sound interesting to you, try them yourself:

Get PyCharm 2017.3 EAP 6

As a reminder, PyCharm EAP versions:

If you run into any issues with this version, or another version of PyCharm, please let us know on our YouTrack. If you have other suggestions or remarks, you can reach us on Twitter, or by commenting on the blog.

19 Oct 2017 7:07pm GMT

Lintel Technologies: What is milter?

Everyone gets tons of email these days. This includes everything from super-duper offers from Amazon to princes and wealthy businessmen trying to offer you their money from some African country you have never heard of. Among all these emails in your inbox lie the one or two valuable ones: messages from your friends, bank alerts, work-related stuff. Spam is a problem that email service providers have been battling for ages. There are a few open-source spam-fighting tools available, like SpamAssassin or SpamBayes.

What is milter ?

Simply put, milter is mail filtering technology. It was designed by the sendmail project and is now available in other MTAs as well. People have historically used all kinds of solutions for filtering mail on servers, such as procmail or MTA-specific methods. The current scene seems to be moving toward sieve. But there is a huge difference between milter and sieve. Sieve comes into the picture after the mail has already been accepted by the MTA and handed over to the MDA. Milter, on the other hand, springs into action in the mail-receiving part of the MTA. When a remote server makes a new connection to your MTA, your MTA gives you an opportunity to accept or reject the mail at every step of the way: the new connection, the reception of each header, and the reception of the body.

[Figure: simplified milter protocol stages]

The above picture depicts a simplified version of how the milter protocol works. Full details of the milter protocol can be found at https://github.com/avar/sendmail-pmilter/blob/master/doc/milter-protocol.txt. Milter is not just for filtering: you can also use it to modify messages or change headers.

HOW DO I GET STARTED WITH CODING A MILTER PROGRAM?

If you want to get started in C, you can use libmilter. For Python you have a couple of options:

  1. pymilter - https://pythonhosted.org/milter/
  2. txmilter - https://github.com/flaviogrossi/txmilter

Postfix supports the milter protocol. You can find everything related to Postfix's milter support here: http://www.postfix.org/MILTER_README.html
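
To give a flavour of option 1, here is a minimal pymilter sketch that rejects mail whose subject contains a blacklisted word. Treat it as an illustrative skeleton built on pymilter's Milter.Base API; the socket address and the 'viagra' check are made up for the example.

import Milter


class SubjectFilter(Milter.Base):
    """Reject mail whose Subject header contains a blacklisted word."""

    def header(self, name, value):
        # Called once for every header the MTA receives.
        if name.lower() == 'subject' and 'viagra' in value.lower():
            return Milter.REJECT
        return Milter.CONTINUE

    def eom(self):
        # End of message: everything looked fine, accept it.
        return Milter.ACCEPT


if __name__ == '__main__':
    # Point your MTA (e.g. Postfix's smtpd_milters setting) at this socket.
    Milter.factory = SubjectFilter
    Milter.runmilter('subjectfilter', 'inet:8894@127.0.0.1', 240)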

WHY NOT SIEVE, WHY MILTER?

I found sieve to be rather limited. It doesn't offer many options for implementing complex logic; it was purposely made like that. Also, sieve starts at the end of the mail reception process, after the mail has already been accepted by the MTA.

Coding a milter program in your favorite programming language gives you full power and allows you to implement complex, creative stuff.

WATCHOUT!!!

When writing milter programs, take proper care to return a reply to the MTA quickly. Don't do long-running tasks in the milter program while the MTA is waiting for a reply. This will have crazy side effects, like remote parties submitting the same mail multiple times and filling up your inbox.

The post What is milter? appeared first on Lintel Technologies Blog.

19 Oct 2017 8:05am GMT

Talk Python to Me: #134 Python in Climate Science

What is the biggest challenge facing human civilization right now? Fake news, poverty, hunger? Yes, all of those are huge problems right now. Well, if climate change kicks in, you can bet it will amplify these problems and more. That's why it's critical that we get answers and fundamental models to help understand where we are, where we are going, and how we can improve things.

19 Oct 2017 8:00am GMT

17 Oct 2017

Django community aggregator: Community blog posts

Mercurial Mirror For Django 2.0 Branch

The first Beta was released today, so it seems a good day to start the mirror for the 2.0 branch of Django. For the record, the main purposes of this mirror are: being a lightweight read-only repository to clone from for production servers, and hiding the ugly git stuff behind a great mercurial interface. The clone is […]

17 Oct 2017 11:22pm GMT

Mailchimp Integration

The Mailchimp API is extensive...

17 Oct 2017 7:35pm GMT

16 Oct 2017

Django community aggregator: Community blog posts

Automating Dokku Setup with AWS Managed Services

Dokku is a great little tool. It lets you set up your own virtual machine (VM) to facilitate quick and easy Heroku-like deployments through a git push command. Builds are fast, and updating environment variables is easy. The problem is that Dokku includes all of your services on a single instance. When you run your database on the Dokku instance, you risk losing it (and any data that's not yet backed up) should your VM suddenly fail.

Enter Amazon Web Services (AWS). By creating your database via Amazon's Relational Database Service (RDS), you get the benefit of simple deploys along with the redundancy and automated failover that can be set up with RDS. AWS, of course, includes other managed services that might help reduce the need to configure and maintain extra services on your Dokku instance, such as ElastiCache and Elasticsearch.

I've previously written about managing your AWS container infrastructure with Python and described a new project I'm working on called AWS Web Stacks. Sparked by some conversations with colleagues at the Caktus office, I began wondering if it would be possible to use a Dokku instance in place of Elastic Beanstalk (EB) or Elastic Container Service (ECS) to help simplify deployments. It turns out that it is not only possible to use Dokku in place of EB or ECS in a CloudFormation stack, but doing so speeds up build and deployment times by an order of magnitude, all while substituting a simple, open source tool for what was previously a vendor-specific resource. This "CloudFormation-aware" Dokku instance accepts inputs via CloudFormation parameters, and watches the CloudFormation stack for updates to resources that might result in changes to its environment variables (such as DATABASE_URL).

The full code (a mere 277 lines as of the time of this post) is available on GitHub, but I think it's helpful to walk through it section by section to understand exactly how CloudFormation and Dokku interact. The original code and the CloudFormation templates in this post are written in troposphere, a library that lets you create CloudFormation templates in Python instead of writing JSON manually.

First, we create some parameters so we can configure the Dokku instance when the stack is created, rather than opening up an HTTP server to the public internet.

key_name = template.add_parameter(Parameter(
    "KeyName",
    Description="Name of an existing EC2 KeyPair to enable SSH access to "
                "the AWS EC2 instances",
    Type="AWS::EC2::KeyPair::KeyName",
    ConstraintDescription="must be the name of an existing EC2 KeyPair."
))

dokku_version = template.add_parameter(Parameter(
    "DokkuVersion",
    Description="Dokku version to install, e.g., \"v0.10.4\" (see "
                "https://github.com/dokku/dokku/releases).",
    Type="String",
    Default="v0.10.4",
))

dokku_web_config = template.add_parameter(Parameter(
    "DokkuWebConfig",
    Description="Whether or not to enable the Dokku web config "
                "(defaults to false for security reasons).",
    Type="String",
    AllowedValues=["true", "false"],
    Default="false",
))

dokku_vhost_enable = template.add_parameter(Parameter(
    "DokkuVhostEnable",
    Description="Whether or not to use vhost-based deployments "
                "(e.g., foo.domain.name).",
    Type="String",
    AllowedValues=["true", "false"],
    Default="true",
))

root_size = template.add_parameter(Parameter(
    "RootVolumeSize",
    Description="The size of the root volume (in GB).",
    Type="Number",
    Default="30",
))

ssh_cidr = template.add_parameter(Parameter(
    "SshCidr",
    Description="CIDR block from which to allow SSH access. Restrict "
                "this to your IP, if possible.",
    Type="String",
    Default="0.0.0.0/0",
))

Next, we create a mapping that allows us to look up the correct AMI for the latest Ubuntu 16.04 LTS release by AWS region:

template.add_mapping('RegionMap', {
    "ap-northeast-1": {"AMI": "ami-0417e362"},
    "ap-northeast-2": {"AMI": "ami-536ab33d"},
    "ap-south-1": {"AMI": "ami-df413bb0"},
    "ap-southeast-1": {"AMI": "ami-9f28b3fc"},
    "ap-southeast-2": {"AMI": "ami-bb1901d8"},
    "ca-central-1": {"AMI": "ami-a9c27ccd"},
    "eu-central-1": {"AMI": "ami-958128fa"},
    "eu-west-1": {"AMI": "ami-674cbc1e"},
    "eu-west-2": {"AMI": "ami-03998867"},
    "sa-east-1": {"AMI": "ami-a41869c8"},
    "us-east-1": {"AMI": "ami-1d4e7a66"},
    "us-east-2": {"AMI": "ami-dbbd9dbe"},
    "us-west-1": {"AMI": "ami-969ab1f6"},
    "us-west-2": {"AMI": "ami-8803e0f0"},
})

The AMIs can be located manually via https://cloud-images.ubuntu.com/locator/ec2/, or programmatically via the JSON-like data available at https://cloud-images.ubuntu.com/locator/ec2/releasesTable.

To allow us to access other resources (such as the S3 buckets and CloudWatch Logs group) created by AWS Web Stacks we also need to set up an IAM instance role and instance profile for our Dokku instance:

instance_role = iam.Role(
    "ContainerInstanceRole",
    template=template,
    AssumeRolePolicyDocument=dict(Statement=[dict(
        Effect="Allow",
        Principal=dict(Service=["ec2.amazonaws.com"]),
        Action=["sts:AssumeRole"],
    )]),
    Path="/",
    Policies=[
        assets_management_policy,  # defined in assets.py
        logging_policy,  # defined in logs.py
    ]
)

instance_profile = iam.InstanceProfile(
    "ContainerInstanceProfile",
    template=template,
    Path="/",
    Roles=[Ref(instance_role)],
)

Next, let's set up a security group for our instance, so we can limit SSH access only to our IP(s) and open only ports 80 and 443 to the world:

security_group = template.add_resource(ec2.SecurityGroup(
    'SecurityGroup',
    GroupDescription='Allows SSH access from SshCidr and HTTP/HTTPS '
                     'access from anywhere.',
    VpcId=Ref(vpc),
    SecurityGroupIngress=[
        ec2.SecurityGroupRule(
            IpProtocol='tcp',
            FromPort=22,
            ToPort=22,
            CidrIp=Ref(ssh_cidr),
        ),
        ec2.SecurityGroupRule(
            IpProtocol='tcp',
            FromPort=80,
            ToPort=80,
            CidrIp='0.0.0.0/0',
        ),
        ec2.SecurityGroupRule(
            IpProtocol='tcp',
            FromPort=443,
            ToPort=443,
            CidrIp='0.0.0.0/0',
        ),
    ]
))

Since EC2 instances themselves are ephemeral, let's create an Elastic IP that we can keep assigned to our current Dokku instance, in the event the instance needs to be recreated for some reason:

eip = template.add_resource(ec2.EIP("Eip"))

Now for the EC2 instance itself. This resource makes up nearly half the template, so we'll take it section by section. The first part is relatively straightforward. We create the instance with the correct AMI for our region; the instance type, SSH public key, and root volume size configured in the stack parameters; and the security group, instance profile, and VPC subnet we defined elsewhere in the stack:

ec2_instance_name = 'Ec2Instance'
ec2_instance = template.add_resource(ec2.Instance(
    ec2_instance_name,
    ImageId=FindInMap("RegionMap", Ref("AWS::Region"), "AMI"),
    InstanceType=container_instance_type,
    KeyName=Ref(key_name),
    SecurityGroupIds=[Ref(security_group)],
    IamInstanceProfile=Ref(instance_profile),
    SubnetId=Ref(container_a_subnet),
    BlockDeviceMappings=[
        ec2.BlockDeviceMapping(
            DeviceName="/dev/sda1",
            Ebs=ec2.EBSBlockDevice(
                VolumeSize=Ref(root_size),
            )
        ),
    ],
    # ...
    Tags=Tags(
        Name=Ref("AWS::StackName"),
    ),
))

Next, we define a CreationPolicy that allows the instance to alert CloudFormation when it's finished installing Dokku:

ec2_instance = template.add_resource(ec2.Instance(
    # ...
    CreationPolicy=CreationPolicy(
        ResourceSignal=ResourceSignal(
            Timeout='PT10M',  # 10 minutes
        ),
    ),
    # ...
))

The UserData section defines a script that is run when the instance is initially created. This is the only time this script is run. In it, we install the CloudFormation helper scripts, execute a set of scripts that we define later, and signal to CloudFormation that the instance creation is finished:

ec2_instance = template.add_resource(ec2.Instance(
    # ...
    UserData=Base64(Join('', [
        '#!/bin/bash\n',
        # install cfn helper scripts
        'apt-get update\n',
        'apt-get -y install python-pip\n',
        'pip install https://s3.amazonaws.com/cloudformation-examples/'
        'aws-cfn-bootstrap-latest.tar.gz\n',
        'cp /usr/local/init/ubuntu/cfn-hup /etc/init.d/cfn-hup\n',
        'chmod +x /etc/init.d/cfn-hup\n',
        # don't start cfn-hup yet until we install cfn-hup.conf
        'update-rc.d cfn-hup defaults\n',
        # call our "on_first_boot" configset (defined below):
        'cfn-init --stack="', Ref('AWS::StackName'), '"',
        ' --region=', Ref('AWS::Region'),
        ' -r %s -c on_first_boot\n' % ec2_instance_name,
        # send the exit code from cfn-init to our CreationPolicy:
        'cfn-signal -e $? --stack="', Ref('AWS::StackName'), '"',
        ' --region=', Ref('AWS::Region'),
        ' --resource %s\n' % ec2_instance_name,
    ])),
    # ...
))

Finally, in the MetaData section, we define a set of cloud-init scripts that (a) install Dokku, (b) configure global Dokku environment variables with the environment variables based on our stack (e.g., DATABASE_URL, CACHE_URL, ELASTICSEARCH_ENDPOINT, etc.), (c) install some configuration files needed by the cfn-hup service, and (d) start the cfn-hup service:

ec2_instance = template.add_resource(ec2.Instance(
    # ...
    Metadata=cloudformation.Metadata(
        cloudformation.Init(
            cloudformation.InitConfigSets(
                on_first_boot=['install_dokku', 'set_dokku_env', 'start_cfn_hup'],
                on_metadata_update=['set_dokku_env'],
            ),
            install_dokku=cloudformation.InitConfig(
                commands={
                    '01_fetch': {
                        'command': Join('', [
                            'wget https://raw.githubusercontent.com/dokku/dokku/',
                            Ref(dokku_version),
                            '/bootstrap.sh',
                        ]),
                        'cwd': '~',
                    },
                    '02_install': {
                        'command': 'sudo -E bash bootstrap.sh',
                        'env': {
                            'DOKKU_TAG': Ref(dokku_version),
                            'DOKKU_VHOST_ENABLE': Ref(dokku_vhost_enable),
                            'DOKKU_WEB_CONFIG': Ref(dokku_web_config),
                            'DOKKU_HOSTNAME': domain_name,
                            # use the key configured by key_name
                            'DOKKU_KEY_FILE': '/home/ubuntu/.ssh/authorized_keys',
                            # should be the default, but be explicit just in case
                            'DOKKU_SKIP_KEY_FILE': 'false',
                        },
                        'cwd': '~',
                    },
                },
            ),
            set_dokku_env=cloudformation.InitConfig(
                commands={
                    '01_set_env': {
                        # redirect output to /dev/null so we don't write
                        # environment variables to log file
                        'command': 'dokku config:set --global {} >/dev/null'.format(
                            ' '.join(['=$'.join([k, k]) for k in dict(environment_variables).keys()]),
                        ),
                        'env': dict(environment_variables),
                    },
                },
            ),
            start_cfn_hup=cloudformation.InitConfig(
                commands={
                    '01_start': {
                        'command': 'service cfn-hup start',
                    },
                },
                files={
                    '/etc/cfn/cfn-hup.conf': {
                        'content': Join('', [
                            '[main]\n',
                            'stack=', Ref('AWS::StackName'), '\n',
                            'region=', Ref('AWS::Region'), '\n',
                            'umask=022\n',
                            'interval=1\n',  # check for changes every minute
                            'verbose=true\n',
                        ]),
                        'mode': '000400',
                        'owner': 'root',
                        'group': 'root',
                    },
                    '/etc/cfn/hooks.d/cfn-auto-reloader.conf': {
                        'content': Join('', [
                            # trigger the on_metadata_update configset on any
                            # changes to Ec2Instance metadata
                            '[cfn-auto-reloader-hook]\n',
                            'triggers=post.update\n',
                            'path=Resources.%s.Metadata\n' % ec2_instance_name,
                            'action=/usr/local/bin/cfn-init',
                            ' --stack=', Ref('AWS::StackName'),
                            ' --resource=%s' % ec2_instance_name,
                            ' --configsets=on_metadata_update',
                            ' --region=', Ref('AWS::Region'), '\n',
                            'runas=root\n',
                        ]),
                        'mode': '000400',
                        'owner': 'root',
                        'group': 'root',
                    },
                },
            ),
        ),
    ),
    # ...
))

The install_dokku and start_cfn_hup scripts are configured to run only the first time the instance boots, whereas the set_dokku_env script is configured to run any time any metadata associated with the EC2 instance changes.

Want to give it a try? Before creating a stack, you'll need to upload your SSH public key to the Key Pairs section of the AWS console so you can select it via the KeyName parameter. Click the Launch Stack button below to create your own stack on AWS. For help filling in the CloudFormation parameters, refer to the Specify Details section of the post on managing your AWS container infrastructure with Python. If you create a new account to try it out, or if your account is less than 12 months old and you're not already using free tier resources, the default instance types in the stack should fit within the free tier, and unneeded services can be disabled by selecting (none) for the instance type.

Dokku-No-NAT

Once the stack is set up, you can deploy to it as you would to any Dokku instance (or to Heroku proper):

ssh dokku@<your domain or IP> apps:create python-sample
git clone https://github.com/heroku/python-sample.git
cd python-sample
git remote add dokku dokku@<your domain or IP>:python-sample
git push dokku master

Alternatively, fork the aws-web-stacks repo on GitHub and adjust it to suit your needs. Contributions welcome.

Good luck and have fun!

16 Oct 2017 6:30pm GMT

12 Oct 2017

Planet Twisted

Jonathan Lange: SPAKE2 in Haskell: How Haskell Helped

Porting SPAKE2 from Python to Haskell helped me understand how SPAKE2 worked, and a large part of that is due to specific features of Haskell.

What's this again?

As a favour for Jean-Paul, I wrote a Haskell library implementing SPAKE2, so he could go about writing a magic-wormhole client. This turned out to be much more work than I expected. Although there was a perfectly decent Python implementation for me to crib from, my ignorance of cryptography and the lack of standards documentation for SPAKE2 made it difficult for me to be sure I was doing the right thing.

One of the things that made it easier was the target language: Haskell. Here's how.

Elliptic curves-how do they work?

The arithmetic around elliptic curves can be slow. There's a trick where you can do the operations in 4D space, rather than 2D space, which somehow makes the operations faster. Brian's code calls these "extended points". The 2D points are called "affine points".

However, there's a catch. Many of the routines can generate extended points that aren't on the curve we're working in, which makes them useless (possibly dangerous) for our cryptography.

The Python code deals with this using runtime checks and documentation. There are many checks of isoncurve, and comments like extended->extended.

Because I have no idea what I'm doing, I wanted to make sure I got this right.

So when I defined ExtendedPoint, I put whether or not the point is on the curve (in the group) into the type.

e.g.

-- | Whether or not an extended point is a member of Ed25519.
data GroupMembership = Unknown | Member

-- | A point that might be a member of Ed25519.
data ExtendedPoint (groupMembership :: GroupMembership)
  = ExtendedPoint
  { x :: !Integer
  , y :: !Integer
  , z :: !Integer
  , t :: !Integer
  } deriving (Show)

This technique is called phantom types.

It means we can write functions with signatures like this:

isExtendedZero :: ExtendedPoint irrelevant -> Bool

Which figures out whether an extended point is zero, and we don't care whether it's in the group or not.

Or functions like this:

doubleExtendedPoint
  :: ExtendedPoint preserving
  -> ExtendedPoint preserving

Which says that whether or not the output is in the group is determined entirely by whether the input is in the group.

Or like this:

affineToExtended
  :: AffinePoint
  -> ExtendedPoint 'Unknown

Which means that we know that we don't know whether a point is on the curve after we've projected it from affine to extended.

And we can very carefully define functions that decide whether an extended point is in the group or not, which have signatures that look like this:

ensureInGroup
  :: ExtendedPoint 'Unknown
  -> Either Error (ExtendedPoint 'Member)

This pushes our documentation and runtime checks into the type system. It means the compiler will tell me when I accidentally pass an extended point that's not a member (or not proven to be a member) to something that assumes it is a member.

When you don't know what you are doing, this is hugely helpful. It can feel a bit like a small child trying to push a star-shaped thing through the square-shaped hole. The types are the holes that guide how you insert code and values.

What do we actually need?

Python famously uses "duck typing". If you have a function that uses a value, then any value that has the right methods and attributes will work, probably.

This is very useful, but it can mean that when you are trying to figure out whether your value can be used, you have to resort to experimentation.

inbound_elem = g.bytes_to_element(self.inbound_message)
if inbound_elem.to_bytes() == self.outbound_message:
   raise ReflectionThwarted
pw_unblinding = self.my_unblinding().scalarmult(-self.pw_scalar)
K_elem = inbound_elem.add(pw_unblinding).scalarmult(self.xy_scalar)

Here, g is a group. What does it need to support? What kinds of things are its elements? How are they related?

Here's what the type signature for the corresponding Haskell function looks like:

generateKeyMaterial
  :: AbelianGroup group
  => Spake2Exchange group  -- ^ An initiated SPAKE2 exchange
  -> Element group  -- ^ The outbound message from the other side (i.e. inbound to us)
  -> Element group -- ^ The final piece of key material to generate the session key.

This makes it explicit that we need something that implements AbelianGroup, which is an interface with defined methods.

If we start to rely on something more, the compiler will tell us. This allows for clear boundaries.

When reverse engineering the Python code, it was never exactly clear whether a function in a group implementation was meant to be public or private.

By having interfaces (type classes) enforced by the compiler, this is much more clear.

What comes first?

The Python SPAKE2 code has a bunch of assertions to make sure that one method isn't called before another.

In particular, you really shouldn't generate the key until you've generated your message and received one from the other side.

Using Haskell, I could put this into the type system, and get the compiler to take care of it for me.

We have a function that initiates the exchange, startSpake2:

-- | Initiate the SPAKE2 exchange. Generates a secret (@xy@) that will be held
-- by this side, and transmitted to the other side in "blinded" form.
startSpake2
  :: (AbelianGroup group, MonadRandom randomly)
  => Spake2 group
  -> randomly (Spake2Exchange group)

This takes a Spake2 object for a particular AbelianGroup, which has our password scalar and protocol parameters, and generates a Spake2Exchange for that group.

We have another function that computes the outbound message:

-- | Determine the element (either \(X^{\star}\) or \(Y^{\star}\)) to send to the other side.
computeOutboundMessage
  :: AbelianGroup group
  => Spake2Exchange group
  -> Element group

This takes a Spake2Exchange as its input. This means it is _impossible_ for us to call it unless we have already called startSpake2.

We don't need to write tests for what happens if we try to call it before we call startSpake2; in fact, we cannot write such tests. They won't compile.

Psychologically, this helped me immensely. It's one less thing I have to worry about getting right, and that frees me up to explore other things.

It also meant I had to do less work to be satisfied with correctness. This one-line type signature replaces two or three tests.

We can also see that startSpake2 is the only thing that generates random numbers. This means we know that computeOutboundMessage will always return the same element for the same initiated exchange.

Conclusion

Haskell helped me be more confident in the correctness of my code, and also gave me tools to explore the terrain further.

It's easy to think of static types as being a constraint the binds you and prevents you from doing wrong things, but an expressive type system can help you figure out what code to write.

12 Oct 2017 11:00pm GMT

10 Oct 2017

Planet Twisted

Itamar Turner-Trauring: The lone and level sands of software

There's that moment late at night when you can't sleep, and you're so tired you can't even muster the energy to check the time. So you stare blindly at the ceiling and look back over your life, and you think: "Did I really accomplish anything? Was my work worth anything at all?"

I live in a 140-year-old house, a house which has outlasted its architect and builders, and quite possibly will outlast me. But having spent the last twenty years of my life building software, I can't really hope to have my own work live on. In those late night moments I sometimes believe that my resume, like that of most programmers, should open with a quote from Shelley's mocking poem:

My name is Ozymandias, King of Kings;
Look on my Works, ye Mighty, and despair!
Nothing beside remains. Round the decay
Of that colossal Wreck, boundless and bare
The lone and level sands stretch far away.

Who among us has not had projects canceled, rewritten from scratch, obsoleted, abandoned or discarded? Was that code worth writing, or was all that effort just a futile waste?

Decay, boundless and bare

Consider some of the projects I've worked on. I've been writing software for 20+ years at this point, which means I've accumulated many decayed wrecks:

I could go on, but that would just make me sadder. This is not to say none of my software lives on: there are open source projects, mostly, that have survived quite a while, and will hopefully continue for many more. But I've spent years of my life working on software that is dead and gone.

How about you? How much of your work has survived?

Which yet survive

So what do you have left, after all these years of effort? You get paid for your work, of course, and getting paid has its benefits. And if you're lucky your software proved valuable to someone, for a while at least, before it was replaced or shut down. For me at least that's worth even more than the money.

But there's something else you gain, something you get to take with you when the money is spent and your users have moved on: knowledge, skills, and mistakes you'll know how to avoid next time. Every failure I've listed above, every mistake I've made, every preventable rewrite, is something I hope to avoid the next time around.

And while software mostly dies quickly, the ideas live on, and if we pay attention it'll be the good ideas that survive. I've borrowed ideas for my own logging library from software that is now dead. If my library dies one day, and no doubt it will, I can only hope its own contributions will be revived by one of my users, or even someone who just half-remembers a better way of doing things.

Dead but not forgotten

Since the ultimate benefit of most software projects is what you learned from them, it's important to make sure you're actually learning. It's easy to just do your work and move on. If you're not careful you'll forget to look for the mistakes to avoid next time, and you won't notice the ideas that are the only thing that can truly survive in the long run.

As for me, I've been writing a weekly newsletter where I share my mistakes, some mentioned above, others in my current work: you can gain from my failures, without all the wasted effort.

10 Oct 2017 4:00am GMT

04 Oct 2017

Planet Twisted

Itamar Turner-Trauring: Technical skills alone won't make you productive

When you're just starting out in your career as a programmer, the variety and number of skills you think you need to learn can be overwhelming. And working with colleagues who can produce far more than you can be intimidating, demoralizing, and confusing: how do they do it? How can some programmers create so much more?

The obvious answer is that these productive programmers have technical skills. They know more programming languages, more design patterns, more architectural styles, more testing techniques. And all these do help: they'll help you find a bug faster, or implement a solution that is more elegant and efficient.

But the obvious answer is insufficient: technical skills are necessary, but they're not enough, and they often don't matter as much as you'd think. Productivity comes from avoiding unnecessary work, and unnecessary work is a temptation you'll encounter long before you reach the point of writing code.

In this post I'm going to cover some of the ways you can be unproductive, from most to least unproductive. As you'll see, technical programming skills do help, but only much further along in the process of software development.

How to be unproductive

1. Destructive work

The most wasteful and unproductive thing you can do is work on something that hurts others, or that you personally think is wrong. Instead of creating, you're actively destroying. Instead of making the world a better place, you're making the world a worse place. The better you are at your job, the less productive you are.

Being productive, then, starts with avoiding destructive work.

2. Work that doesn't further your goals

You go to work every day, and you're bored. You're not learning anything, you're not paid well, you don't care one way or another about the results of your work… why bother at all?

Productivity can only be defined against a goal: you're trying to produce some end result. If you're working on something that doesn't further your own goals-making money, learning, making the world a better place-then this work isn't productive for you.

To be productive, figure out your own goals, and then find work that will align your goals with those of your employer.

3. Building something no one wants

You're working for a startup, and it's exciting, hard work, churning out code like there's no tomorrow. Finally, the big day comes: you launch your product to great fanfare. And then no one shows up. Six months later the company shuts down, and you're looking for a new job.

This failure happens at big companies too, and it happens to individuals building their side project: building a product that the world doesn't need. It doesn't matter how good a programmer you are: if you're working on solving a problem that no one has, you're doing unnecessary work.

Personally, I've learned a lot from Stacking the Bricks about how to avoid this form of unproductive work.

4. Running out of time

Even if you're working on a real problem, on a problem you understand well, your work is for naught if you fail to solve the problem before you run out of time or money. Technical skills will help you come up with a faster, simpler solution, but they're not enough. You also need to avoid digressions, unnecessary work that will slow you down.

The additional skills you need here are project planning skills. For example:

5. Solving the symptoms of a problem, instead of the root cause

Finally, you've gotten to the point of solving a problem! Unfortunately, you haven't solved the root cause because you haven't figured out why you're doing your work. You've added a workaround, instead of discovering the bug, or you've made a codepath more efficient, when you could have just ripped out an unused feature altogether.

Whenever you're given a task, ask why you're doing it, what success means, and keep digging until you've found the real problem.

6. Solving a problem inefficiently

You've solved the right problem, on time and on budget! Unfortunately, your design wasn't as clean and efficient as it could have been. Here, finally, technical skills are the most important skills.

Beyond technical skills

If you learn the skills you need to be productive-starting with goals, prioritizing, avoiding digressions, and so on-your technical skills will also benefit. Learning technical skills is just another problem to solve: you need to learn the most important skills, with a limited amount of time. When you're thinking about which skills to learn next, take some time to consider which skills you're missing that aren't a programming language or a web framework.

Here's one suggestion: during my 20+ years as a programmer I've made all but the first of the mistakes I've listed above. You can hear these stories, and learn how to avoid my mistakes, by signing up for my weekly Software Clown email.

04 Oct 2017 4:00am GMT

11 Feb 2016

Planet TurboGears

Christopher Arndt: Organix Roland JX-3P MIDI Expansion Kit

Foreign visitors: to download the Novation Remote SL template for the Roland JX-3P with the Organix MIDI Upgrade, see the link at the bottom of this post. For my last birthday I gave myself a Roland JX-3P, including a DT200 programmer (a PG-200 clone). The JX-3P is a 6-voice analog polysynth from 1983 and […]

11 Feb 2016 8:42pm GMT

13 Jan 2016

Planet TurboGears

Christopher Arndt: Registration for PythonCamp 2016 opens on Friday, 15 January 2016

PythonCamp 2016: free exchange of knowledge all around Python. (The following is an announcement for a Python "un-conference" in Cologne, Germany, and is therefore directed at a German-speaking audience.) Dear Python fans, it's that time again: on Friday, 15 January, we will open online registration for participants of PythonCamp 2016! The seventh edition of PythonCamp will once again be […]

13 Jan 2016 3:00pm GMT

02 Nov 2015

Planet TurboGears

Matthew Wilson: Mary Dunbar is the best candidate for Cleveland Heights Council

I'll vote for Mary Dunbar tomorrow in the Cleveland Heights election.

Here's why:

02 Nov 2015 5:14pm GMT

03 Aug 2012

PySoy Blog

Juhani Åhman: YA Update

Managed to partially fix the shading rendering issues with the examples. I reckon the rest of the rendering issues are OpenGL ES related, and not something on the libsoy side.
I don't know OpenGL (ES) very well, so I didn't attempt to fix them any further.

I finished implementing a rudimentary pointer controller in pysoy's Client.
There is a pointer.py example program for testing it. Unfortunately it keeps crashing once in a while.
I reckon the problem is something with soy.atoms.Position. Regardless, the pointer controller works.

I started working on getting the keyboard controller to work too, and of course mouse buttons for the pointer,
but I got stuck when writing the Python bindings for Genie's events (signals). There's no connect method in pysoy, so maybe that needs to be implemented, or perhaps there's some other solution. I will look into this later.

The plan for this week is to finish documenting bodies, scenes and widgets. I'm about 50% done, and it should be straightforward. Next week I'm finally going to attempt to set up Sphinx and generate readable documentation. I reckon I need to refactor many of the docstrings as well.

03 Aug 2012 12:27pm GMT

10 Jul 2012

PySoy Blog

Mayank Singh: Mid-term and dualshock 3

Now that the SoC mid-term has arrived, here's a bit of an update on what I have done so far. The Wiimote xinput driver IR update is almost done, though, as can be said of any piece of software, it's never fully complete.
I also corrected the code for Sphere in the libsoy repository to render an actual sphere.
For now I have started on integrating the DualShock 3 controller. I am currently studying the code given here: http://www.pabr.org/sixlinux/sixlinux.en.html and trying to understand how the DualShock works. I also need to write a controller class to be able to grab and move objects around without help from the physics engine.

10 Jul 2012 3:00pm GMT

04 Jul 2012

PySoy Blog

Juhani Åhman: Weeks 5-7 update

I've mostly finished writing unit tests for atoms now.
I didn't write tests for Morphs though, since that seems to still be a work in progress.
However, I did encounter a rare memory corruption bug that I'm unable to fix at this point,
because I don't know how to debug it properly.
I can't find the location where the error occurs.

I'm going to spend the rest of this week writing doctests and hopefully getting more examples to work.

04 Jul 2012 9:04am GMT

10 Nov 2011

Python Software Foundation | GSoC'11 Students

Benedict Stein: King Williams Town railway station

Yesterday morning I had to go to the station in KWT to pick up the bus tickets we had reserved for the Christmas holidays in Cape Town. The station itself has had no train service since December for cost reasons, but Translux and co., the long-distance bus companies, have their offices there.






© benste CC NC SA

10 Nov 2011 10:57am GMT

09 Nov 2011

Planet Plone

Andreas Jung: Produce & Publish Plone Client Connector released as open-source

09 Nov 2011 9:30pm GMT

Python Software Foundation | GSoC'11 Students

Benedict Stein

Nobody is worried about something like this - you just drive through by car, and in the city, near Gnobie: "nah, it only gets dangerous once the fire brigade is there" - 30 minutes later, on the way back, the fire brigade was there.




© benste CC NC SA

09 Nov 2011 8:25pm GMT

Planet Plone

ACLARK.NET, LLC: Plone secrets: Episode 4 – Varnish in front

This just in from the production department: use Varnish. (And please forgive the heavily meme-laden approach to describing these techniques :-) .)

Cache ALL the hosts

Our ability to use Varnish in production is no secret by now, or at least it shouldn't be. What is often less clear is exactly how to use it. One way I like[1] is to run Varnish on your public IP, port 80, and make Apache listen on your private IP, port 80. Then proxy from Varnish to Apache and enjoy easy caching goodness on all your virtual hosts in Apache.

Configuration

This should require less than five minutes of down time to implement. First, configure the appropriate settings. (Well, first install Apache and Varnish if you haven't already: `aptitude install varnish apache2` on Ubuntu Linux[0].)

Varnish

To modify the listen IP address and port, we typically edit a file like /etc/default/varnish (in Ubuntu). However you do it, configure the equivalent of the following on your system:

DAEMON_OPTS="-a 174.143.252.11:80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -s malloc,256m"

This environment variable is then passed to varnishd on the command line. Next, pass traffic to Apache like so (in /etc/varnish/default.vcl on Ubuntu):

backend default {
  .host = "127.0.0.1";
  .port = "80";
}

Now on to Apache.

Please note that the syntax above is for Varnish 3.x and the syntax has (annoyingly) changed from 2.x to 3.x.

Apache

The Apache part is a bit simpler. You just need to change the listen port (on Ubuntu this is done in /etc/apache2/ports.conf), typically from something like:

Listen *:80

to:

Listen 127.0.0.1:80

Restart ALL the services

Now restart both services. If all goes well you shouldn't notice any difference, except for better performance, and except when you make a website change and need to clear the cache[2]. For this, I rely on telnetting to the varnish port and issuing the `ban.url` command (formerly `url.purge` in 2.x):

$ telnet localhost 6082
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
200 205     
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,2.6.35.4-rscloud,x86_64,-smalloc,-smalloc,-hcritbit

Type 'help' for command list.
Type 'quit' to close CLI session.

ban.url /
200 0

Cache ALL the disks

This site has Varnish and Apache configured as described in this article. It also has disk caching in Apache enabled, thanks to Elizabeth Leddy's article:

As a result, it's PEPPY AS THE DICKENS™ on a 512MB "slice" (Cloud server) from Rackspace Cloud. And now you know yet another "Plone secret". Now go make your Plone sites faster, and let me know how it goes in the comments section below.

Notes

[0] Using the latest distribution, "oneiric".

[1] I first saw this technique at NASA when NASA Science was powered by Plone; I found it odd at the time but years later it makes perfect sense.

[2] Ideally you'd configure this in p.a.caching, but I've not been able to stomach this yet.


09 Nov 2011 5:50pm GMT

Planet Zope.org

Updated MiniPlanet, now with meta-feed

My MiniPlanet Zope product has been working steadily and stably for some years, when suddenly a user request came along. Would it be possible to get a feed of all the items in a miniplanet? With this update it became possible. MiniPlanet is an old-styl...

09 Nov 2011 9:41am GMT

08 Nov 2011

Planet Plone

Max M: How to export all redirects from portal_redirection in an older Plone site

Just add the method below to the RedirectionTool and call it from the browser as:

http://localhost:8080/site/portal_redirection/getAllRedirects

Assuming that the site is running at localhost:8080, that is :-S

That will show a list of redirects that can be imported into Plone 4.x.


security.declareProtected(View, 'getAllRedirects')
def getAllRedirects(self):
    "get'm'all"
    result = []
    reference_tool = getToolByName(self, 'reference_catalog')
    for k, uuid in self._redirectionmap.items():
        obj = reference_tool.lookupObject(uuid)
        if obj is None:
            print 'could not find redirect from: %s to %s' % (k, uuid)
        else:
            path = '/'.join(('',) + obj.getPhysicalPath()[2:])
            result.append('%s,%s' % (k, path))
    return '\n'.join(result)
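
For the import side, here is a minimal sketch of what loading those exported lines into a Plone 4.x site could look like. This is not part of the original post: it assumes plone.app.redirector's IRedirectionStorage utility, and the helper name and the way the lines are passed in are made up for illustration.

# Hypothetical import helper for Plone 4.x (illustration only).
from zope.component import getUtility
from plone.app.redirector.interfaces import IRedirectionStorage

def import_redirects(portal, lines):
    # 'lines' are the "old,new" pairs produced by getAllRedirects above.
    storage = getUtility(IRedirectionStorage, context=portal)
    portal_path = '/'.join(portal.getPhysicalPath())
    for line in lines:
        line = line.strip()
        if not line:
            continue
        old_path, new_path = line.split(',', 1)
        # The storage keys on full ZODB paths, so prefix the portal path.
        storage.add(portal_path + old_path, portal_path + new_path)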

08 Nov 2011 2:58pm GMT

Python Software Foundation | GSoC'11 Students

Benedict Stein: Brai Party

Braai = a barbecue evening or similar.

She'd love a technician for patching up her SpeakOn / jack plug junctions...

The ladies, the "mamas" of the settlement, during the official opening speech

Even though fewer people came than expected: loud music and lots of people ...

And of course a fire with real wood for grilling.

© benste CC NC SA

08 Nov 2011 2:30pm GMT

07 Nov 2011

Planet Zope.org

Welcome to Betabug Sirius

It has been quite some time since I announced that I'd be working as a freelancer. Lots of stuff had to be done in that time, but finally things are ready. I've founded my own little company and set up a small website: Welcome to Betabug Sirius!

07 Nov 2011 9:26am GMT

03 Nov 2011

Planet Zope.org

Assertion helper for zope.testbrowser and unittest

zope.testbrowser is a valuable tool for integration tests. Historically, the Zope community used to write quite a lot of doctests, but we at gocept have found them to be rather clumsy and too often yielding neither good tests nor good documentation. That's why we don't use doctest much anymore, and prefer plain unittest.TestCases instead. However, doctest has one very nice feature, ellipsis matching, that is really helpful for checking HTML output, since it lets you make assertions about only the parts that interest you. For example, given this kind of page:

>>> print browser.contents
<html>
  <head>
    <title>Simple Page</title>
  </head>
  <body>
    <h1>Simple Page</h1>
  </body>
</html>

If all you're interested in is that the <h1> is rendered properly, you can simply say:

>>> print browser.contents
<...<h1>Simple Page</h1>...

We've now ported this functionality to unittest, as assertEllipsis, in gocept.testing. Some examples:

self.assertEllipsis('...bar...', 'foo bar qux')
# -> nothing happens

self.assertEllipsis('foo', 'bar')
# -> AssertionError: Differences (ndiff with -expected +actual):
     - foo
     + bar

self.assertNotEllipsis('foo', 'foo')
# -> AssertionError: "Value unexpectedly matches expression 'foo'."

To use it, inherit from gocept.testing.assertion.Ellipsis in addition to unittest.TestCase.
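
For illustration, a minimal sketch of such a test case (the class name and the HTML literal are made up; in a real integration test the contents would come from zope.testbrowser):

import unittest

import gocept.testing.assertion


class SimplePageTest(gocept.testing.assertion.Ellipsis, unittest.TestCase):

    def test_heading_is_rendered(self):
        # Stand-in for browser.contents from zope.testbrowser.
        contents = '<html><body><h1>Simple Page</h1></body></html>'
        self.assertEllipsis('...<h1>Simple Page</h1>...', contents)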


03 Nov 2011 7:19am GMT

19 Nov 2010

Planet CherryPy

Robert Brewer: logging.statistics

Statistics about program operation are an invaluable monitoring and debugging tool. How many requests are being handled per second, how much of various resources are in use, how long we've been up. Unfortunately, the gathering and reporting of these critical values is usually ad-hoc. It would be nice if we had 1) a centralized place for gathering statistical performance data, 2) a system for extrapolating that data into more useful information, and 3) a method of serving that information to both human investigators and monitoring software. I've got a proposal. Let's examine each of those points in more detail.

Data Gathering

Just as Python's logging module provides a common importable for gathering and sending messages, statistics need a similar mechanism, and one that does not require each package which wishes to collect stats to import a third-party module. Therefore, we choose to re-use the logging module by adding a statistics object to it.

That logging.statistics object is a nested dict:

import logging
if not hasattr(logging, 'statistics'): logging.statistics = {}

It is not a custom class, because that would 1) require apps to import a third-party module in order to participate, 2) inhibit innovation in extrapolation approaches and in reporting tools, and 3) be slow. There are, however, some specifications regarding the structure of the dict.

    {
   +----"SQLAlchemy": {
   |        "Inserts": 4389745,
   |        "Inserts per Second":
   |            lambda s: s["Inserts"] / (time() - s["Start"]),
   |  C +---"Table Statistics": {
   |  o |        "widgets": {-----------+
 N |  l |            "Rows": 1.3M,      | Record
 a |  l |            "Inserts": 400,    |
 m |  e |        },---------------------+
 e |  c |        "froobles": {
 s |  t |            "Rows": 7845,
 p |  i |            "Inserts": 0,
 a |  o |        },
 c |  n +---},
 e |        "Slow Queries":
   |            [{"Query": "SELECT * FROM widgets;",
   |              "Processing Time": 47.840923343,
   |              },
   |             ],
   +----},
    }

The logging.statistics dict has strictly 4 levels. The topmost level is nothing more than a set of names to introduce modularity. If SQLAlchemy wanted to participate, it might populate the item logging.statistics['SQLAlchemy'], whose value would be a second-layer dict we call a "namespace". Namespaces help multiple emitters to avoid collisions over key names, and make reports easier to read, to boot. The maintainers of SQLAlchemy should feel free to use more than one namespace if needed (such as 'SQLAlchemy ORM').

Each namespace, then, is a dict of named statistical values, such as 'Requests/sec' or 'Uptime'. You should choose names which will look good on a report: spaces and capitalization are just fine.

In addition to scalars, values in a namespace MAY be a (third-layer) dict, or a list, called a "collection". For example, the CherryPy StatsTool keeps track of what each worker thread is doing (or has most recently done) in a 'Worker Threads' collection, where each key is a thread ID; each value in the subdict MUST be a fourth-layer dict (whew!) of statistical data about each thread. We call each subdict in the collection a "record". Similarly, the StatsTool also keeps a list of slow queries, where each record contains data about each slow query, in order.
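
To make that concrete, a namespace holding such a collection might be populated like this (the namespace name 'My App' and the record fields are purely illustrative):

    import logging
    import time

    if not hasattr(logging, 'statistics'): logging.statistics = {}
    appstats = logging.statistics.setdefault('My App', {})    # namespace

    # A collection: one record per worker thread, keyed by thread ID.
    workers = appstats.setdefault('Worker Threads', {})
    workers['7972'] = {                # '7972' stands in for a real thread ID
        'Current Request': '/index.html',
        'Bytes Written': 4096,
        'Start Time': time.time(),
    }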

Values in a namespace or record may also be functions, which brings us to:

Extrapolation

def extrapolate_statistics(scope):
    """Return an extrapolated copy of the given scope."""
    c = {}
    for k, v in scope.items():
        if isinstance(v, dict):
            v = extrapolate_statistics(v)
        elif isinstance(v, (list, tuple)):
            v = [extrapolate_statistics(record) for record in v]
        elif callable(v):
            v = v(scope)
        c[k] = v
    return c

The collection of statistical data needs to be fast, as close to unnoticeable as possible to the host program. That requires us to minimize I/O, for example, but in Python it also means we need to minimize function calls. So when you are designing your namespace and record values, try to insert the most basic scalar values you already have on hand.

When it comes time to report on the gathered data, however, we usually have much more freedom in what we can calculate. Therefore, whenever reporting tools fetch the contents of logging.statistics for reporting, they first call extrapolate_statistics (passing the whole statistics dict as the only argument). This makes a deep copy of the statistics dict so that the reporting tool can both iterate over it and even change it without harming the original. But it also expands any functions in the dict by calling them. For example, you might have a 'Current Time' entry in the namespace with the value "lambda scope: time.time()". The "scope" parameter is the current namespace dict (or record, if we're currently expanding one of those instead), allowing you access to existing static entries. If you're truly evil, you can even modify more than one entry at a time.
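
As a quick illustration, using the extrapolate_statistics function above (the entry names are illustrative):

    import time

    stats = {
        'My App': {
            'Start Time': time.time(),
            # Each function receives its enclosing namespace (or record) as 'scope'.
            'Current Time': lambda scope: time.time(),
            'Uptime': lambda scope: time.time() - scope['Start Time'],
        },
    }

    report = extrapolate_statistics(stats)
    # report['My App']['Uptime'] is now a plain float; the original stats
    # dict still contains the lambdas and is left untouched.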

However, don't try to calculate an entry and then use its value in further extrapolations; the order in which the functions are called is not guaranteed. This can lead to a certain amount of duplicated work (or a redesign of your schema), but that's better than complicating the spec.

After the whole thing has been extrapolated, it's time for:

Reporting

A reporting tool would grab the logging.statistics dict, extrapolate it all, and then transform it to (for example) HTML for easy viewing, or JSON for processing by Nagios etc (and because JSON will be a popular output format, you should seriously consider using Python's time module for datetimes and arithmetic, not the datetime module). Each namespace might get its own header and attribute table, plus an extra table for each collection. This is NOT part of the statistics specification; other tools can format how they like.
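
For instance, a bare-bones JSON reporter (just a sketch; the function name is made up, and this is not part of the specification) could be as small as:

    import json
    import logging

    def report_json():
        """Serialize the extrapolated statistics for a monitoring tool."""
        stats = getattr(logging, 'statistics', {})
        return json.dumps(extrapolate_statistics(stats), indent=2)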

Turning Collection Off

It is recommended each namespace have an "Enabled" item which, if False, stops collection (but not reporting) of statistical data. Applications SHOULD provide controls to pause and resume collection by setting these entries to False or True, if present.

Usage

    import logging
    import time
    # Initialize the repository
    if not hasattr(logging, 'statistics'): logging.statistics = {}
    # Initialize my namespace
    mystats = logging.statistics.setdefault('My Stuff', {})
    # Initialize my namespace's scalars and collections
    mystats.update({
        'Enabled': True,
        'Start Time': time.time(),
        'Important Events': 0,
        'Events/Second': lambda s: (
            (s['Important Events'] / (time.time() - s['Start Time']))),
        })
    ...
    for event in events:
        ...
        # Collect stats
        if mystats.get('Enabled', False):
            mystats['Important Events'] += 1

Original post blogged on b2evolution.

19 Nov 2010 7:08am GMT

12 Nov 2010

feedPlanet CherryPy

Kevin Dangoor: Paver is now on GitHub, thanks to Almad

Paver, the project scripting tool for Python, has just moved to GitHub thanks to Almad. Almad has stepped forward and offered to properly bring Paver into the second decade of the 21st century (doesn't have the same ring to it as bringing something into the 21st century, does it? :)

Seriously, though, Paver reached the point where it was good enough for me and did what I wanted (and, apparently, what a good number of other people wanted as well). Almad has some thoughts on where the project should go next, and I'm looking forward to hearing more about them. Sign up for the Google group to see where Paver is going next.

12 Nov 2010 3:11am GMT

09 Nov 2010

feedPlanet CherryPy

Kevin Dangoor: Paver: project that works, has users, needs a leader

Paver is a Python project scripting tool that I initially created in 2007 to automate a whole bunch of tasks around projects that I was working on. It knows about setuptools and distutils, and it has some ideas on handling documentation with example code. It also has users who occasionally like to send in patches. The latest release has had more than 3700 downloads on PyPI.

Paver hasn't needed a lot of work, because it does what it says on the tin: helps you automate project tasks. Sure, there's always more that one could do. But, there isn't more that's required for it to be a useful tool, day-to-day.

Here's the point of my post: Paver is in danger of being abandoned. At this point, everything significant that I am doing is in JavaScript, not Python. The email and patch traffic is low, but it's still too much for someone who's not even actively using the tool any more.

If you're a Paver user and either:

1. want to take the project in fanciful new directions, or

2. want to keep the project humming along with a new .x release every now and then

please let me know.

09 Nov 2010 7:44pm GMT