01 Mar 2021

Planet Python

John Ludhi/nbshare.io: Opinion Mining Aspect Level Sentiment Analysis

Opinion Mining - Aspect Level Sentiment Analysis

Aspect-level sentiment analysis employs multiple machine learning processes. The first parses each sentence to extract the relations between words and identify the aspects of a review. The second analyses the sentiment of the adjectives used to describe those aspects.

This can be done automatically using Azure's Text Analytics service. All we need to do is create a free account on Microsoft Azure and create a Text Analytics service:

  1. Once you have created your account and logged in, go to the Azure portal.
  2. Search for Text Analytics and create a new service.
  3. It will ask for a resource group; click on "create new".
  4. Choose the free tier, which works fine for personal experimentation.
  5. Once the service is created, go to your resources, look for Keys and Endpoint, then copy the key and endpoint into the following cell.
In [1]:
KEY = "PUT THE KEY HERE"
ENDPOINT = "PUT THE ENDPOINT HERE"

The following function authenticates your credentials and connects to Azure. We can then communicate with the Text Analytics service through the client object.

In [ ]:
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential


def authenticate_client():
    ta_credential = AzureKeyCredential(KEY)
    text_analytics_client = TextAnalyticsClient(
        endpoint=ENDPOINT,
        credential=ta_credential)
    return text_analytics_client


client = authenticate_client() # we will interact with the Text Analytics service via this object.

We will use Jupyter widgets (ipywidgets) to create an interactive tool for opinion mining.

In [2]:
import ipywidgets as widgets

We will use the Plotly library for interactive visualizations.

In [ ]:
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode
from plotly.subplots import make_subplots
init_notebook_mode() # this line is required to be able to export the notebook as html with the plots.
In [4]:
# given three scores (positive, neutral, negative), this function plots a pie chart of the three sentiments
def plot_sentiment_scores(pos, neut, neg):
    return go.Figure(go.Pie(labels=["Positive", "Neutral", "Negative"], values=[pos, neut, neg],
                            textinfo='label+percent',
                            marker=dict(colors=["#2BAE66FF", "#795750", "#C70039"])),
                     layout=dict(showlegend=False)
                     )

Sentiment Analysis using Azure's Text Analytics

Azure's Text Analytics analyzes documents, not just sentences. A document is a string that can contain several sentences, and the API expects a list of documents. So our input must be a list of strings.

We can use our Azure client to call the analyze_sentiment method, which returns a list of results, one for each document passed in. Since we are passing a single document containing one sentence, we are interested in the first item it returns; its confidence_scores attribute holds three values: the positive, neutral, and negative sentiment scores.

In [ ]:
response = client.analyze_sentiment(documents=["This movie is fantastic"])
response
In [ ]:
response[0]

AnalyzeSentimentResult(id=0, sentiment=positive, warnings=[], statistics=None, confidence_scores=SentimentConfidenceScores(positive=1.0, neutral=0.0, negative=0.0), sentences=[SentenceSentiment(text=This movie is fantastic, sentiment=positive, confidence_scores=SentimentConfidenceScores(positive=1.0, neutral=0.0, negative=0.0), offset=0, mined_opinions=[])], is_error=False)

In [ ]:
print(f"Positive: {response[0].confidence_scores.positive}")
print(f"Neutral: {response[0].confidence_scores.neutral}")
print(f"Negative: {response[0].confidence_scores.negative}")

Positive: 1.0
Neutral: 0.0
Negative: 0.0

Let's put all of this in a function that takes text (one or more sentences) as input and plots the distribution of sentiment scores as a pie chart!

In [ ]:
def sentiment_analysis_example(sentences):
    document = [sentences] # we use only one document for this function
    response = client.analyze_sentiment(documents=document)[0] # we use [0] to get only the first and only document
    print("Document Sentiment: {}".format(response.sentiment))
    plot_sentiment_scores(response.confidence_scores.positive,
                          response.confidence_scores.neutral,
                          response.confidence_scores.negative
                         ).show()
    
    
    # here we plot the sentiment for each sentence in the document.
    for idx, sentence in enumerate(response.sentences):
        print("Sentence: {}".format(sentence.text))
        print("Sentence {} sentiment: {}".format(idx+1, sentence.sentiment))
        plot_sentiment_scores(sentence.confidence_scores.positive,
                              sentence.confidence_scores.neutral,
                              sentence.confidence_scores.negative
                              ).show()
In [ ]:
sentiment_analysis_example("The acting was good. The graphics however were just okayish. I did not like the ending though.")

Document Sentiment: mixed

Sentence: The acting was good.
Sentence 1 sentiment: positive

Sentence: The graphics however were just okayish.
Sentence 2 sentiment: negative

Sentence: I did not like the ending though.
Sentence 3 sentiment: negative

Aspect Level Opinion Mining Using Azure Text Analytics

Aspect-level opinion mining differs from reporting the overall sentiment of a sentence in two main ways:

  1. We extract specific aspects in the sentences.
  2. We detect the opinion about the aspect in the text, not just a sentiment score.
In [ ]:
response = client.analyze_sentiment(
    ["The food and service were unacceptable and meh, but the concierge were nice and ok"],
    show_opinion_mining=True # the only addition is that we set `show_opinion_mining` to True
)[0]
In [ ]:
# now we can also access the mined_opinions in a sentence
mined_opinion = response.sentences[0].mined_opinions[0]
aspect = mined_opinion.aspect
print(f"Aspect: {aspect.text}")
for opinion in mined_opinion.opinions:
    print(f"Opinion: {opinion.text}\tSentiment:{opinion.sentiment}".expandtabs(12))
    # p.s. we use expandtabs because unacceptable is longer than 8 characters
    # , so we want the \t to consider it one long word

Aspect: food
Opinion: unacceptable   Sentiment:negative
Opinion: meh            Sentiment:mixed

Let's make this more visual

In [ ]:
def plot_sentiment_gauge(pos_score, title, domain=[0, 1]):
    fig = go.Figure(go.Indicator(
        mode="gauge+number",
        value=pos_score,
        gauge={'axis': {'range': [0, 1]}},
        domain={'x': domain, 'y': [0, 1]},
        title={'text': f"{title}", "font":dict(size=14)}), layout=dict(width=800, height=600, margin=dict(l=150,r=150)))
    return fig
In [ ]:
def sentiment_analysis_with_opinion_mining_example(sentences,
                                                   document_level=True,
                                                   sentence_level=True,
                                                   aspect_level=True,
                                                   opinion_level=True):

    document = [sentences]

    response = client.analyze_sentiment(document, show_opinion_mining=True)[0]

    if document_level:  # plotting overall document sentiment
        print("Document Sentiment: {}".format(response.sentiment))
        plot_sentiment_scores(response.confidence_scores.positive,
                              response.confidence_scores.neutral,
                              response.confidence_scores.negative
                              ).show()
    if not(sentence_level or aspect_level or opinion_level):
        # no need to continue if no plots are needed
        return response
    
    for sentence in response.sentences:
        if sentence_level:  # plotting the overall sentence sentiment
            print(f"Sentence: {sentence.text}")
            print(f"Sentence sentiment: {sentence.sentiment}")
            plot_sentiment_scores(
                sentence.confidence_scores.positive,
                sentence.confidence_scores.neutral,
                sentence.confidence_scores.negative).show()

        for mined_opinion in sentence.mined_opinions:
            aspect = mined_opinion.aspect

            if aspect_level:  # plotting the sentiment of the aspect
                plot_sentiment_gauge(
                    aspect.confidence_scores.positive, f"Aspect ({aspect.text})").show()

            if opinion_level:
                opinions = mined_opinion.opinions
                n = len(opinions)
                gauges = list()
                for i, opinion in enumerate(opinions, start=1):
                    gauges.append(plot_sentiment_gauge(
                        opinion.confidence_scores.positive, f"Opinion ({opinion.text})",
                        # this is just to show the plots next to each other
                        domain=[(i-1)/n, i/n]
                    ).data[0])

                go.Figure(gauges, layout=go.Layout(
                    height=600, width=800, autosize=False)).show()
    return response
In [ ]:
response = sentiment_analysis_with_opinion_mining_example(
    "The food and service were unacceptable and meh, but the concierge were nice and ok",
    document_level=False, sentence_level=False
)






Text Analytics using Jupyter widgets

Now let's create some Jupyter widgets to interact with this function.

In [ ]:
# some text to get the input
text = widgets.Textarea(placeholder="Enter your text here")
# checkboxes to select different levels of analysis
document_cb = widgets.Checkbox(value=True, description="Document Level")
sentence_cb = widgets.Checkbox(value=True, description="Sentence Level")
aspect_cb = widgets.Checkbox(value=True, description="Aspect Level")
opinion_cb = widgets.Checkbox(value=True, description="Opinion Level")

# some button to trigger the analysis
btn = widgets.Button(description="Analyse")

# some place to show the output on
out = widgets.Output()

def analysis(b):
    with out:
        out.clear_output()
        sentences = text.value # get the input sentences from the Textarea widget
        # pass the input text to our `sentiment_analysis_with_opinion_mining_example` function
        sentiment_analysis_with_opinion_mining_example(sentences,
                                                       document_level=document_cb.value,
                                                       sentence_level=sentence_cb.value,
                                                       aspect_level=aspect_cb.value,
                                                       opinion_level=opinion_cb.value
                                                      )

btn.on_click(analysis) # bind the button to the `analysis` function

# put all widgets together in a tool
checkboxes = widgets.VBox([document_cb, sentence_cb, aspect_cb,opinion_cb])
tool = widgets.VBox([widgets.HBox([text, checkboxes]), btn, out]) 
# give a default value for the text
text.value = "The food and service were unacceptable and meh, but the concierge were nice and ok"
tool

01 Mar 2021 10:41am GMT

Zero to Mastery: Python Monthly 💻🐍 February 2021

15th issue of Python Monthly! Read by 20,000+ Python developers every month. This monthly Python newsletter is focused on keeping you up to date with the industry and keeping your skills sharp, without wasting your valuable time.

01 Mar 2021 10:00am GMT

Tryton News: Newsletter for March 2021

Five Bulb Lights

Here's a sneak peek at the improvements that landed during the last month.

Changes for the User

We now show the carrier on the shipment list so it's possible to prioritize shipments based on the carrier.

We've added a wizard to make it easy to add lots to stock moves. The sequence to use for the lot number can be configured for each product.

We ensure the unit prices for stock moves are up to date when their invoices are posted or their moves are done.

The account move lines created by a statement now have the statement line as their origin. This makes it simpler to audit the accounts.

We now use the menu path from which a window was opened as its name.

We now warn the user when they try to post a statement with cancelled or paid invoices and then remove them from the statement.

A delivery usage checkbox has been added to contact mechanisms just like for addresses. It can be used, for example, to indicate which email address to send notifications related to deliveries.

The clients now display the revision on the dialog. This is useful, for example, when opening the party dialog from the invoice when the history is activated. This way the user can see from which date the information is displayed.

It is easy to get lost when quickly opening consecutive dialog fields. To improve the situation, the clients now display breadcrumbs in the title showing the browsing path to the dialog.

We've added the new identifiers from python-stdnum 1.15.

We no longer create accounting moves for stock when the amount involved is 0.

There is now a scheduled task that can be configured to fetch currency rates at a specific frequency. By default it gets the rates from the European Central Bank.

New Modules

Changes for the System Administrator

We've added device cookie support to the clients. This allows these clients to not be affected by the brute force attack protection.

Changes for the Developer

It is now possible to send emails with different "FROM" addresses for the envelope and header.
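
To illustrate the distinction (independently of Tryton's own email helpers), here is a minimal sketch using Python's standard library; the addresses and SMTP server are placeholders:

    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "Support <support@example.com>"  # header FROM, shown to the recipient
    msg["To"] = "customer@example.com"
    msg["Subject"] = "Your invoice"
    msg.set_content("Please find your invoice attached.")

    with smtplib.SMTP("localhost") as smtp:
        # envelope FROM (used for routing and bounces) differs from the header FROM above
        smtp.send_message(msg, from_addr="bounces@example.com")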

All the warnings can be skipped automatically by adding a single key named _skip_warnings to the context.
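
As a rough sketch of how that might be used in module code (assuming the usual trytond Transaction context API; the model call here is hypothetical):

    from trytond.transaction import Transaction

    # Assumption: user warnings raised by calls inside this block are skipped
    # because the `_skip_warnings` key is present in the context.
    with Transaction().set_context(_skip_warnings=True):
        Invoice.post(invoices)  # hypothetical call that would normally trigger warnings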

We've added the trigonometric functions to the SQLite back-end.

Any fields that are loaded eagerly are no longer instantiated automatically; instead, the id is just stored in the cache. The instantiation is done only if the field is actually accessed. This improves the performance of some operations by up to 13%, but the actual improvement you can expect will depend a lot on the number of fields the model has.

It is now possible to define help text for each selection value. However, at the moment only the web client can display it.

We made the ModelView.parse_view method public. This allows the XML that makes up the view to be modified by code before it is sent to the client.

It is now possible to group the report renderings by header. As the OpenDocument format only supports a single header and footer definition, this feature renders a different file for each header and places them in a zip file if needed. This is used when rendering company related reports which display the company information in the header/footer.

In order to simplify the dependencies in our web client, we replaced tempusdominus with the browser's native input methods for types date, datetime-local and time when available.

In order to make better use of the browse cache, the getter method of Function fields is called with cache sized groups of records.

01 Mar 2021 9:00am GMT

Mike Driscoll: PyDev of the Week: Jonathan Hoffstadt

This week we welcome Jonathan Hoffstadt (@jhoffs1) as our PyDev of the Week! Jonathan is the co-author of Dear PyGUI. It's a neat, new Python GUI package. You can see what else Jonathan has been working on over on GitHub.

Let's spend some time getting to know Jonathan better!

Can you tell us a little about yourself (hobbies, education, etc):

I'm a mechanical engineer based in Houston, Texas. I have a bachelor's degree in Mechanical Engineering from Louisiana State University, was a Tow Gunner in the U.S. Marines, and I've been working in the oil and gas industry since I graduated university.

My hobbies include chess, shooting, and programming. With programming, I find 3D graphics to be extremely interesting.

Why did you start using Python?

I'd been interested in programming since middle school after I was given a C++ for dummies book as a gift, but I did not encounter Python until university. It was there that I started using Python as a free alternative to MATLAB for assignments. It wasn't long before I was hooked on the language.

I started using it outside of homework for anything and everything I could. This included making small games, automating tasks at internships, controlling breadboards with raspberry pi's, and everything in between. When compared to other languages, I was amazed at how quickly you could make things happen.

I ended up using Python for courses in Finite Element Analysis and Computational Fluid Dynamics. For our senior design capstone project, my team was tasked with building an Arc Welding 3D printer. As the member with the most exposure to programming, I was responsible for the software side of the project in which I used Python to control all the mechanical devices including a robotic arm and custom electronics the team created. I also wrote my first user interface which used tkinter and pygame to wrap an open source slicing engine and provide a 3D view of tool paths and the robotic arm position.

What other programming languages do you know, and which is your favorite?

C, C++, and Java are my other primary languages, though I've worked with C#, Swift, and Objective-C.

The truth is that I have 2 favorite languages. C++ for large projects. Python for small projects, scripting, and just getting things done!

What projects are you working on now?

I currently spend most of my time working on Dear PyGui.

Which Python libraries are your favorite (core or 3rd party)?

My favorite Python libraries would have to be NumPy, Pillow, tqdm, json, and Nuitka.

How did your package, Dear PyGUI, come about?

Dear PyGui is a graphical user interface library I coauthored with my friend, Preston Cothren. As mechanical engineers, we use python daily for calculations, scripts, and plugins for various software used in mechanical design and analysis. We wanted an easy way to add an interface to the various scripts with minimal effort.

The first iteration of the software was called "Engineer's Sandbox" which was commercial. Not only was it easy to create small interfaces but it also made it easy to package and distribute. It came with a built in IDE and 60 premade apps. "Sandbox" was a C++ program that embedded python into it where the graphics were created with wxWidgets. Ultimately, this project was unsuccessful, gaining only a few hundred users. You can see an image of it below:

Engineer's Sandbox (precursor to Dear PyGUI)

Six months after abandoning Engineer's Sandbox, we revisited the idea and reassessed. We came to 3 realizations:

1. Our primary target audience (mechanical engineers) was mostly uninterested in programming.

2. The software was too restrictive and limited for developers (our second target audience). Limited widgets, layouts, limited 3rd party operability, etc.

3. Most developers prefer using open source libraries.

From these realizations, we went back to the drawing board and decided to make a full GUI library. With this iteration being open source, a Python extension (instead of a standalone software), and as easy to use as possible.

Between iterations, we fell in love with the C++ library, Dear ImGui, and so decided to use it as the underlying library to build around. With Dear ImGui being an immediate-mode GUI library, it allowed us to make Dear PyGui extremely dynamic when compared to other UI libraries.

Dear PyGui has continued to rapidly improve and grow in popularity since we released the first open beta in July of 2020:

Dear PyGUI Examples

What are the top three things you've learned as an open-source developer?

As an open-source developer, I've learned that:

1. It's hard work and will make you appreciate open-source software and developers.

2. Listen to the community but also know when to say "no".

3. Funding is difficult to find, so you should enjoy the work you are doing.

Is there anything else you'd like to say?

Yes! This is for those new to programming. I'm often asked how to learn a programming language, library, topic, etc. and my answer has always been: The best way to learn anything in programming is to just start building things.

I typically skim a book then immediately start trying to build something. As I get stuck, I go back to the book to read the relevant sections more closely. Once it's time to refactor and optimize, I typically go back to the book and read the more advanced sections now that I'm more aware of the issues that the advanced sections try to address. I've found this technique helps me a lot. Although you may end up reinventing the wheel by saving the advanced topics for after you're done, you will end up with a deeper understanding that you are unlikely to forget.

Thanks for doing the interview, Jonathan!

The post PyDev of the Week: Jonathan Hoffstadt appeared first on Mouse Vs Python.

01 Mar 2021 6:05am GMT

28 Feb 2021

Planet Python

Matthew Wright: Profiling Python code with line_profiler

Once we have debugged, working, readable (and hopefully testable) code, it may become important to examine it more closely and try to improve the code's performance. Before we can make any progress in determining if our changes are an improvement, we need to measure the current performance and see where it is spending its time. … Continue reading Profiling Python code with line_profiler

The post Profiling Python code with line_profiler appeared first on wrighters.io.

28 Feb 2021 11:44pm GMT

Test and Code: 146: Automation Tools for Web App and API Development and Maintenance - Michael Kennedy

Building any software, including web apps and APIs, requires testing.
There's automated testing, and there's manual testing.

In between that is exploratory testing aided by automation tools.

Michael Kennedy joins the show this week to share some of the tools he uses during development and maintenance.

We talk about tools used for semi-automated exploratory testing.
We also talk about some of the other tools and techniques he uses to keep Talk Python Training, Talk Python, and Python Bytes all up and running smoothly.

We talk about:

  - Postman
  - ngrok
  - sitemap link testing
  - scripts for manual processes
  - using failover servers during maintenance, redeployments, etc.
  - GitHub webhooks and scripts to switch between failover servers and production during deployments automatically
  - floating IP addresses
  - services to monitor your site: StatusCake, Better Uptime
  - the effect of monitoring on analytics
  - crash reporting: Rollbar, Sentry
  - response times
  - load testing: Locust

Special Guest: Michael Kennedy.

Sponsored By:

  - Linode: If it runs on Linux, it runs on Linode. Get started on Linode today with $100 in free credit for listeners of Test & Code. (https://linode.com/testandcode)

Support Test & Code : Python Testing (https://www.patreon.com/testpodcast)

Links:

  - Python Bytes Podcast: https://pythonbytes.fm/
  - Talk Python To Me Podcast: https://talkpython.fm/
  - Talk Python Training: https://training.talkpython.fm/
  - Postman: https://www.postman.com/
  - ngrok: https://ngrok.com/
  - StatusCake: https://www.statuscake.com/
  - Better Uptime: https://betteruptime.com/
  - Rollbar: https://rollbar.com/
  - Sentry: https://sentry.io
  - Locust: https://locust.io/
  - 12 requests per second in Python: https://suade.org/dev/12-requests-per-second-with-python/

28 Feb 2021 11:00pm GMT

Codementor: Server deployment with Python: From A to Z.

In this tutorial you will learn how to configure a server and deploy a web app from scratch by using only Python.

28 Feb 2021 5:58pm GMT

27 Feb 2021

Planet Python

The Open Sourcerer: A new data format has landed in the upcoming GTG 0.5

Here's a general call for testing from your favorite pythonic native Linux desktop personal productivity app, GTG.

In recent months, Diego tackled the epic task of redesigning the XML file format from a new specification devised with the help of Brent Saner (proposal episodes 1, 2 and 3), and then implementing the new file format in GTG. This work has now been merged to the main development branch on GTG's git repository.

Diego's changes are major, invasive technological changes, and they would benefit from extensive testing by everybody with "real data" before 0.5 happens (very soon). I've done some pretty extensive testing & bug reporting in the last few months; Diego fixed all the issues I've reported so far, so I've pretty much run out of serious bugs now, as only a few remain targeted to the 0.5 milestone… But I'm only human, and it is possible that issues might remain, even after my troll-testing.

Grab GTG's git version ASAP, with a copy of your real data (for extra caution, and also because we want you to test with real data); see the instructions in the README, including the "Where is my user data and config stored?" section.
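
For reference, here is a rough command-line sketch of that testing setup. The repository URL, launcher script, and data path below are assumptions on my part rather than details from this post, so treat the README as the authoritative source:

    # back up your real data first (the path may differ on your system)
    cp -r ~/.local/share/gtg ~/gtg-data-backup

    # grab the git version and run it from the source tree
    git clone https://github.com/getting-things-gnome/gtg.git
    cd gtg
    ./launch.sh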

Please torture-test it to make sure everything is working properly, and report issues you may find (if any). Look for anything that might seem broken "compared to 0.4", incorrect task parenting/associations, incorrect tagging, broken content, etc.

If you've tried to break it and still couldn't find any problems, maybe one way to indicate that would be a "👍" on the merge request; I'm not sure we really have another way to know if it turns out that "everything is OK" 🙂

Your help in testing this (or spreading the word) will help ensure a smooth transition for users getting an upgrade from 0.4 to 0.5, letting us release 0.5 with confidence. Thanks!

27 Feb 2021 11:53pm GMT

Weekly Python StackOverflow Report: (cclxv) stackoverflow python report

These are the ten most rated questions at Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2021-02-27 18:35:08 GMT


  1. Why aren't my list elements being swapped? - [11/1]
  2. Title words in a column except certain words - [8/4]
  3. specify number of spaces between pandas DataFrame columns when printing - [8/1]
  4. More effective / clean way to aggregate data - [5/5]
  5. Is there a way to write this if-else using min and max? - [5/4]
  6. What does the operator += return in Python - [5/4]
  7. Find missing numbers in a sorted column in Pandas Dataframe - [5/3]
  8. Perform best cycle sort knowing order at the end - [5/1]
  9. Making a scroll bar but its inconsistent - [5/1]
  10. Python import mechanism and module mocks - [5/1]

27 Feb 2021 8:42pm GMT

Corey Gallon: 3 Simple Steps to Build a Python Package for Conda Forge

In just 3 easy steps, we'll package the spotipy Python library for Conda Forge!

Hey data hackers! We're all raving fans of Conda Forge - the community-led repository of recipes, build infrastructure and distributions for the conda package manager, right? It's a rich source of the most useful, up-to-date libraries for Python (and many other languages, including R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN … the list goes on). You may well have found, though, that the library you're looking for isn't available in the repo, which is precisely what I found recently when building a machine learning model to predict song popularity based on its musical attributes. What does one do in such a situation? Package the library yourself, of course! While this may seem a daunting task, we'll work through 3 simple steps to build a Python package for Conda Forge and submit it to the repository.

Why build a package for Conda Forge?

The excellent spotipy library, written and maintained by the similarly awesome data scientist and developer Paul Lamere (Director of the Developer Platform at Spotify), is (well … was) not available in Conda Forge. This library is a Python wrapper for the Spotify API. Recently, I wrote a machine learning model to predict a song's popularity based on its musical attributes. The data underlying the model were to be pulled from the Spotify API, using the spotipy library. Unfortunately, at the time I went to build the model, the only Python package repo offering spotipy was PyPI (the repo that 'pip' installs from). You, like me, may have learned the hard way that it is inadvisable to mix packages from both pip and conda in a conda environment. In order to get through my little machine learning project, I downloaded the spotipy source code from PyPI and both built and installed it locally. Wanting to write about the project, though, I realized that this approach is suboptimal for most who want to hack along with the article, hence the decision to package spotipy myself.

"Enough background, already - let's build!"

Rightho! Enough background … let's get started! The good people at Conda Forge provide detailed instructions on how to contribute packages to the repo. We'll distill, from these detailed instructions, 3 simple steps to follow to build and contribute your Python package to Conda Forge.

  1. Fork the Conda Forge examples recipes repository at GitHub
  2. Tailor the example recipe for your package
  3. Submit a pull request to Conda Forge

Let's get into it!

1. Fork the Conda Forge examples repository at GitHub

Before we head over to GitHub, be sure that the source code for your package is available for download in a single file. This should be an archive (e.g. tarball, zip, etc …) of some kind. In this case, as we're packaging spotipy from PyPI, we can confirm that the source code is, indeed, available in a gzipped tarball there. You should confirm, similarly, for the package you plan to build and contribute.

We'll be working in GitHub for this project. If you've never used GitHub before, or need a brief refresher, here's the great Hello World! documentation they offer.

Okay! Pop over to GitHub, fork the 'staged-recipes' repository, and open your fork in GitHub Desktop. This will clone the repo to your local machine.

Clone the repo locally by opening it in GitHub Desktop

In GitHub Desktop, within your newly forked repo, create a new branch from the staged-recipes master branch. Name the new branch as is sensible for your package. In this case, I named the branch 'spotipy'. To create a new branch, simply click the 'master' branch button and type the name of the new branch into the 'Filter' text field.

Create a new branch by typing the name of the branch and hitting the 'New branch' button

Now we'll create a new folder in the 'recipes' folder, and copy the example recipe into the new folder. To do this, open the files in the repository in your operating system's file browser.

Open the files of the repo within your operating system's file browser

The newly opened window will look something like this, depending on your operating system (this is Windows 10)

These are the files of the cloned repository

Navigate to the 'recipes' folder (highlighted above) and create a copy of the 'example' folder (CTRL + drag the folder in Windows 10), then rename it to reflect your package name. NB: this is an important step - don't just create an empty folder; copy and rename the example folder so that its 'meta.yaml' file comes along too.

Copy and rename the 'example' folder to your package name

Within your newly created folder, open the meta.yaml file in your favorite text editor, and …

2. Tailor the example recipe for your package

Conda Forge recipes are written in the YAML (YAML Ain't Markup Language) data serialization language. If you've not written YAML before, or need a brief refresher, here's a great reference. The copy of the example recipe meta.yaml file gives you a commented skeleton to fill in.
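In broad strokes, a staged-recipes recipe has the shape sketched below. Treat this as an illustration only: every value here (package name, version, URL, checksum, license, maintainer) is a placeholder, not the actual contents of the example file.

{% set name = "yourpackage" %}
{% set version = "1.0.0" %}

package:
  name: {{ name|lower }}
  version: {{ version }}

source:
  # where the build will download the source archive from
  url: https://pypi.io/packages/source/{{ name[0] }}/{{ name }}/{{ name }}-{{ version }}.tar.gz
  sha256: <sha256 checksum of that archive>

build:
  noarch: python          # pure-Python package, one build for all platforms
  number: 0
  script: "{{ PYTHON }} -m pip install . -vv"

requirements:
  host:
    - python
    - pip
  run:
    - python

test:
  imports:
    - yourpackage         # a simple "can it be imported" smoke test

about:
  home: https://github.com/you/yourpackage
  license: MIT
  license_file: LICENSE
  summary: One-line description of the package

extra:
  recipe-maintainers:
    - your-github-username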

Again, the good people at Conda Forge provide very detailed instructions on how to edit this file - both in the file itself, and in their documentation. Another vital bit of documentation is provided by conda here. I'll save you the hassle of reading through all of it at the start, though I found myself referencing these docs frequently throughout the process and strongly suggest you come back and do the same.

Pro tip: use the Python 'grayskull' package to automatically create a near-perfect meta.yaml file!

I spent ages the first time I did this manually editing the meta.yaml file for my package and iteratively submitting it via a pull request. It turns out that all of that brain damage can be avoided by using the Python packaging tools provided by conda to generate this file. The documentation provided by the Conda team is helpful, though the approach outlined here (i.e. using conda skeleton) did not work for me because, I learned after much banging of my head against the keyboard, conda skeleton needs a major refactoring.

Enter grayskull, which will eventually replace conda skeleton. Grayskull is a tool to generate concise recipes for Conda Forge - precisely the task at hand! We won't go through the process of creating a new conda environment here. Simply install grayskull from conda forge in a new conda environment as follows:

conda install -c conda-forge grayskull

With grayskull installed, use it to build a meta.yaml (recipe) for your package, by calling it, passing it the repository in which the package presently lives, and the name of the package. In our case:

grayskull pypi spotipy

Upon a successful build, you'll see something like this …

grayskull has successfully built a recipe for the spotipy package

… and there will be a newly created folder with the name of your package. Inside this folder, you'll find a near-perfect recipe to inform your tailoring of the meta.yaml file you've copied in your local clone of the repo above.

At this point, you could either copy and paste the file into your clone of the repo, overwriting the example file above, or edit the example file down using this information as inputs. I suggest the latter, as the recipe that grayskull creates isn't quite perfect and will likely be rejected during the pull request process without some edits. Importantly, in this case, Conda Forge requires minimum version limits for Python because we're building to the 'noarch' (i.e. operating system non-specific) target. The edits are simple enough, though. In the 'requirements' section of the YAML file, add minimum versions to the python lines for both host and run.
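As a sketch, the relevant part of the requirements section ends up looking something like this (the ">=3.6" floor is just an example - use whatever minimum your package actually supports):

requirements:
  host:
    - python >=3.6    # a minimum version is required because this is a noarch build
    - pip
  run:
    - python >=3.6    # mirror the same floor in the run requirements
    # ...followed by the package's runtime dependencies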

… annnnnnnd we're donezo! The final, pull request-ready recipe meta.yaml file for the spotipy package is as follows.

Note the subtle differences between this file and the one generated by grayskull. Again, I recommend editing the file using the output of grayskull rather than copying and pasting to avoid potential issues during the pull request process.

3. Submit a pull request to Conda Forge

Rightho! We're almost finished. All that remains is to submit a pull request to the maintainers of Conda Forge. Before we do, we'll need to commit the changes we've made in our local clone of the GitHub repo.

Commit the changes to the local repository

Now we'll push the commit back to GitHub.

Push the local commit back to GitHub

Excellent! Now it's time to submit our pull request.

Conda Forge has a really dope continuous integration (CI) pipeline that automates most of the pull request process! A reviewer from the Conda Forge team will, ultimately, review and approve the request, but there is a heap of great feedback from their automated CI that really speeds the process up.

Important note: do not fully submit a pull request for the spotipy package if you've been hacking-along with this article, as it will be rejected as a duplicate (of the package I've already contributed).

When we select "Pull request" from the GitHub desktop app, we're returned to the GitHub website to complete the process. Click the 'Pull request' link at the top right of the repo.

Initiate a pull request to the Conda Forge repo

The next screen shows you compared changes between the original repo we cloned, and the edits we've made.

We can see the changes that were made via commits to the repo

Click the green "Create pull request" button. In the next screen, provide an informative title for the pull request and work carefully through each item of the checklist to confirm that all of the requirements are met.

Provide an informative title for the pull request
Be sure to work through each of the checklist requirements and confirm they are satisfied

Once the pull request is submitted, the aforementioned slick CI process is kicked off and works through 5 automated steps. These include linting, checking the package builds, then checking that the package builds for each of 3 target operating systems (Linux, OSX and Windows). When these checks successfully complete, you'll be notified. (This process can take as much as half an hour or so … be patient!)

Brilliant! We've passed all 5 of the checks that the CI process ran automatically.

Now … we play the waiting game until we hear back from the maintainers with any additional feedback …

In a little while (a few hours, in this case) the maintainers will respond with either feedback or, if all went well, confirmation that the pull request was merged. Congratulations - you are now a package maintainer for Conda Forge! You'll receive an email invitation to the Conda Forge organization on GitHub, which must be accepted within 7 days. After accepting, the automated CI process will build the package so that it is available at Conda Forge. Here's the spotipy package.

Love it! Here's the spotipy package now incorporated in the Conda Forge repo!

We can also run a quick search from the shell to confirm that the package is available to install via conda:

conda search -c conda-forge spotipy
spotipy is ready to install via conda!

… and that's it! I hope you've found this both interesting and accelerating as you make your way into the wonderful world of Python packaging for Conda Forge!

What the huh!? Problems I encountered along the way …

This article comes together in a way that suggests graceful ease of development. As usual, that was certainly not the case. Here are a few things I learned by banging my head against the keyboard during the process of packaging spotipy for Conda Forge.

Screencast forthcoming!

Stay tuned for a video walkthrough of this article!

27 Feb 2021 8:18pm GMT

Andre Roberge: Friendly-traceback: testing with Real Python

Real Python is an excellent learning resource for beginning and intermediate Python programmers who want to learn more about various Python-related topics. Most of the resources on Real Python are behind a paywall, but there are many articles available for free. One of the free articles, Invalid Syntax in Python: Common Reasons for SyntaxError, is a good overview of possible causes of syntax errors when using Python. The Real Python article shows code raising exceptions due to syntax errors and provides some explanation for each case.

In this blog post, I reproduce the cases covered in the Real Python article and show the information provided by Friendly-traceback. Ideally, you should read this blog post side by side with the Real Python article, as I mostly focus on showing screen captures, with very little added explanation or background.

If you want to follow along using Friendly-traceback, make sure that you use version 0.2.34 or newer.

Missing comma: first example from the article

The article starts off showing some code leading to this rather terse and uninformative traceback.
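For context, the offending pattern is a dictionary literal with a comma missing between two entries, along these lines (an illustrative snippet, not the actual contents of theofficefacts.py):

# a dict literal with a missing comma triggers a SyntaxError
ages = {
    'pam': 24,
    'jim': 24        # <-- the comma is missing here
    'michael': 43,
}
print(ages)

On its own, Python (at the time of writing) reports this only as SyntaxError: invalid syntax, pointing somewhere near the entry after the missing comma.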


Since the code is found in a file, we use python -m friendly_traceback theofficefacts.py to run it and obtain the following.



Misusing the Assignment Operator (=)

We only show one example here, as the others mentioned in the article would be redundant. We remind you one last time that, if you are not already doing so, you should really look at the Real Python article at the same time as you go through this rather terse blog post.



Friendly-traceback provides a "hint" right after the traceback. We can get more information by asking why().



Misspelling, Missing, or Misusing Python Keywords

Identifying misspelled keywords was actually inspired by that article from Real Python.



Note that Friendly-traceback identifies "for" as being the most likely misspelled keyword, but gives other possible valid choices.

Friendly-traceback can also identify when break (and return) are used outside a loop.



To the English reader, Friendly-traceback might seem to add very little useful information. However, keep in mind that all this additional information can be translated. If you read the following and do not understand what "boucle" means, then you might get an idea of some of the challenges faced by non-English speakers when using Python.


In some other cases, like the example given in the Real Python article, Friendly-traceback can identify a missing keyword.


As long as there is only one instance of "in" missing, Friendly-traceback can identify it properly.


Finally, two more cases where a Python keyword is not used properly.



Missing Parentheses, Brackets, and Quotes

Five examples taken from the Real Python article offered without additional comments.






Mistaking Dictionary Syntax


Using the Wrong Indentation

Real Python gives many examples. They would all be handled correctly by Friendly-traceback in a similar way as the single example we decided to use for this post.



Defining and Calling Functions




Changing Python Versions



Friendly-traceback requires Python version 3.6 or newer. Not shown here is that it can recognize that the walrus operator, :=, is not valid before Python version 3.8 and give an appropriate message.


Last example: TypeError result of a syntax error.

Let's look at the last example in the Real Python article.


The explanation given by Friendly-traceback might seem weird: "the object (1, 2) was meant to be a function ...". Often one might have assigned a name to that object, which leads to an explanation that should be seen as more reasonable.




The explanation of looking for a "missing comma" when this TypeError is raised was actually added following a suggestion by S. de Menten in the recent contest I held for Friendly-traceback.

There is more ...

Friendly-traceback includes many more cases than those shown above and mentioned in the Real Python article. However, it is limited in that it can only identify the cause of syntax errors when there is a single word or symbol used incorrectly, or if the error message provided by Python is more informative than the dreaded SyntaxError: invalid syntax.

27 Feb 2021 11:51am GMT

Cusy: New: Pattern Matching in Python 3.10

27 Feb 2021 10:04am GMT

Fabio Zadrozny: PyDev 8.2.0 released (external linters, Flake8, path mappings, ...)

PyDev 8.2.0 is now available for download.

This release has many improvements for dealing with external linters.

The main ones are the inclusion of support for the Flake8 linter as well as using a single linter call for analyzing a directory, so that should be much faster now (previously external linters were called once for each file).

Note: to request code analysis for all the contents below a folder, right-click it and choose PyDev > Code analysis:

Another change is that comments are now added to the line indentation...

This means that code such as:

  def method():
      if True:
          pass

Will become:

  def method():
      # if True:
      #     pass

p.s.: it's possible to revert to the old behavior by changing the preferences at PyDev > Editor > Code Style > Comments.

Also note that after some feedback, on the next release an option to format such as the code below will also be added (and will probably be made the default):

  def method():
      # if True:
      #     pass

Interpreter configuration also got a revamp:

So, it's possible to set a given interpreter to be the default one and if you work with conda, select Choose from Conda to select one of your conda environments and configure it in PyDev.

Path mappings for remote debugging can now (finally) be configured from within PyDev itself, so, changing environment variables is no longer needed for that:

Note that Add path mappings template entry may be clicked multiple times to add multiple entries.

That's it... More details may be found at: http://pydev.org.

Hope you enjoy the release 😊


27 Feb 2021 5:44am GMT

26 Feb 2021

feedPlanet Python

Peter Bengtsson: How MDN's site-search works

tl;dr; Periodically, the whole of MDN is built, by our Node code, in a GitHub Action. A Python script bulk-publishes this to Elasticsearch. Our Django server queries the same Elasticsearch via /api/v1/search. The site-search page is a static single-page app that sends XHR requests to the /api/v1/search endpoint. Search results' sort-order is determined by match and "popularity".

Jamstack'ing

The challenge with "Jamstack" websites is with data that is too vast and dynamic that it doesn't make sense to build statically. Search is one of those. For the record, as of Feb 2021, MDN consists of 11,619 documents (aka. articles) in English. Roughly another 40,000 translated documents. In English alone, there are 5.3 million words. So to build a good search experience we need to, as a static site build side-effect, index all of this in a full-text search database. And Elasticsearch is one such database and it's good. In particular, Elasticsearch is something MDN is already quite familiar with because it's what was used from within the Django app when MDN was a wiki.

Note: MDN gets about 20k site-searches per day from within the site.

Build

Diagram

When we build the whole site, it's a script that basically loops over all the raw content, applies macros and fixes, dumps one index.html (via React server-side rendering) and one index.json. The index.json contains all the fully rendered text (as HTML!) in blocks of "prose". It looks something like this:

{
  "doc": {
    "title": "DOCUMENT TITLE",
    "summary": "DOCUMENT SUMMARY",
    "body": [
      {
        "type": "prose", 
        "value": {
          "id": "introduction", 
          "title": "INTRODUCTION",
          "content": "<p>FIRST BLOCK OF TEXTS</p>"
       }
     },
     ...
   ],
   "popularity": 0.12345,
   ...
}

You can see one here: /en-US/docs/Web/index.json

Indexing

Next, after all the index.json files have been produced, a Python script takes over; it traverses all the index.json files and, based on that structure, figures out the title, summary, and the whole body (as HTML).

Next up, before sending this into the bulk-publisher in Elasticsearch it strips the HTML. It's a bit more than just turning <p>Some <em>cool</em> text.</p> to Some cool text. because it also cleans up things like <div class="hidden"> and certain <div class="notecard warning"> blocks.
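A minimal sketch of that kind of clean-up, assuming BeautifulSoup (the selectors below are illustrative; the real script handles more cases than this):

from bs4 import BeautifulSoup

def strip_html(html):
    """Turn a block of rendered HTML into plain text suitable for indexing."""
    soup = BeautifulSoup(html, "html.parser")
    # remove content that should never show up in search results
    for node in soup.select("div.hidden, div.notecard.warning"):
        node.decompose()
    # collapse whatever markup remains into plain text
    return soup.get_text(separator=" ", strip=True)

print(strip_html("<p>Some <em>cool</em> text.</p>"))  # -> Some cool text.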

One thing worth noting is that this whole thing runs roughly every 24 hours and then it builds everything. But what if, between two runs, a certain page has been removed (or moved)? How do you remove what was previously added to Elasticsearch? The solution is simple: it deletes and re-creates the index from scratch every day. The whole bulk-publish takes a while, so right after the index has been deleted, the searches won't be that great. Someone could be unlucky in that they're searching MDN a couple of seconds after the index was deleted and is now waiting for it to build up again.
It's an unfortunate reality, but it's a risk worth taking for the sake of simplicity. Also, most people are searching for things in English and specifically the Web/ tree, so the bulk-publishing is done in such a way that the most popular content is bulk-published first and the rest after. Here's what the build output logs:

Found 50,461 (potential) documents to index
Deleting any possible existing index and creating a new one called mdn_docs
Took 3m 35s to index 50,362 documents. Approximately 234.1 docs/second
Counts per priority prefixes:
    en-us/docs/web                 9,056
    *rest*                         41,306

So, yes, for 3m 35s there's stuff missing from the index and some unlucky few will get fewer search results than they should. But we can optimize this in the future.
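For a concrete picture of the delete-recreate-and-prioritize approach, here is a rough sketch using the official elasticsearch Python client; the index name matches the logs above, but the document shape and the priority rule are simplified assumptions:

from elasticsearch import Elasticsearch, helpers

def publish(docs, es_url, index="mdn_docs"):
    """Recreate the index and bulk-publish documents, most popular tree first."""
    es = Elasticsearch(es_url)
    # start from scratch every run so removed or moved pages disappear
    es.indices.delete(index=index, ignore=[404])
    es.indices.create(index=index)
    # publish the en-US Web/ tree first, everything else after
    docs = sorted(docs, key=lambda d: not d["slug"].startswith("en-us/docs/web"))
    actions = (
        {"_index": index, "_id": doc["slug"], "_source": doc}
        for doc in docs
    )
    helpers.bulk(es, actions)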

Searching

The way you connect to Elasticsearch is simply by a URL; it looks something like this:

https://USER:PASSWD@HASH.us-west-2.aws.found.io:9243

It's an Elasticsearch cluster managed by Elastic running inside AWS. Our job is to make sure that we put the exact same URL in our GitHub Action ("the writer") as we put it into our Django server ("the reader").
In fact, we have 3 Elastic clusters: Prod, Stage, Dev.
And we have 2 Django servers: Prod, Stage.
So we just need to carefully make sure the secrets are set correctly to match the right environment.

Now, in the Django server, we just need to convert a request like GET /api/v1/search?q=foo&locale=fr (for example) to a query to send to Elasticsearch. We have a simple Django view function that validates the query string parameters, does some rate-limiting, creates a query (using elasticsearch-dsl) and packages the Elasticsearch results back to JSON.
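In spirit, that view looks something like the sketch below. The index name, field names, boosts and locale filter are assumptions for illustration, the validation and rate-limiting are reduced to a single check, and a default elasticsearch-dsl connection is assumed to be configured elsewhere:

from django.http import HttpResponseBadRequest, JsonResponse
from elasticsearch_dsl import Q, Search

def search(request):
    q = request.GET.get("q", "").strip()
    locale = request.GET.get("locale", "en-US").lower()
    if not q:
        return HttpResponseBadRequest("missing 'q'")
    # match against title and body, weighting title matches much higher
    match = Q("multi_match", query=q, fields=["title^10", "body"])
    s = Search(index="mdn_docs").filter("term", locale=locale).query(match)[:10]
    hits = s.execute()
    return JsonResponse({
        "documents": [
            {"title": hit.title, "slug": hit.slug, "score": hit.meta.score}
            for hit in hits
        ]
    })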

How we make that query is important. In here lies the most important feature of the search; how it sorts results.

In one simple explanation, the sort order is a combination of popularity and "matchness". The assumption is that most people want the popular content. I.e. they search for foreach and mean to go to /en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/forEach, not /en-US/docs/Web/API/NodeList/forEach, both of which contain forEach in the title. The "popularity" is based on Google Analytics pageviews, which we download periodically and normalize into a floating-point number between 0 and 1. At the time of writing, the scoring function does something like this:

rank = doc.popularity * 10 + search.score

This seems to produce pretty reasonable results.

But there's more to the "matchness" too. Elasticsearch has its own API for defining boosting, which we use to weight matches in certain fields more heavily than others.

This is then applied on top of whatever else Elasticsearch does, such as "Term Frequency" and "Inverse Document Frequency" (tf and idf). This article is a helpful introduction.

We're most likely not done with this. There's probably a lot more we can do to tune this myriad of knobs and sliders to get the best possible ranking of documents that match.

Web UI

The last piece of the puzzle is how we display all of this to the user. The way it works is that developer.mozilla.org/$locale/search returns a static page that is blank. As soon as the page has loaded, it lazy-loads JavaScript that can actually issue the XHR request to get and display search results. The code looks something like this:

function SearchResults() {
  const [searchParams] = useSearchParams();
  const sp = createSearchParams(searchParams);
  // add defaults and stuff here
  const fetchURL = `/api/v1/search?${sp.toString()}`;

  const { data, error } = useSWR(
    fetchURL,
    async (url) => {
      const response = await fetch(url);
      // various checks on the response.status here
      return await response.json();
    }
  );

  // render 'data' or 'error' accordingly here
}

A lot of interesting details are omitted from this code snippet. You have to check it out for yourself to get a more up-to-date insight into how it actually works. But basically, the window.location (and pushState) query string drives the fetch() call and then all the component has to do is display the search results with some highlighting.

The /api/v1/search endpoint also runs a suggestion query as part of the main search query. This extracts out interesting alternative search queries. These are filtered and scored, and we issue "sub-queries" just to get a count for each. Now we can do one of those "Did you mean...". For example: search for intersections.

In conclusion

There are a lot of interesting, important, and careful details that are glossed over in this blog post. It's a constantly evolving system, and we're constantly trying to improve and perfect it so that it fits what users expect.

A lot of people reach MDN via a Google search (e.g. mdn array foreach), but despite that, nearly 5% of all traffic on MDN is the site-search functionality. The /$locale/search?... endpoint is the most frequently viewed page of all of MDN, and having a good, reliable search engine is nevertheless important. Owning and controlling the whole pipeline allows us to do specific things that are unique to MDN that other websites don't need. For example, we index a lot of raw HTML (e.g. <video>) and we have code snippets that need to be searchable.

Hopefully, the MDN site-search will go from being known as very limited to something that can genuinely help people get to the exact page better than Google can. Yes, it's worth aiming high!

26 Feb 2021 10:02pm GMT

PyCharm: PyCharm and WSL

Over the past few months, I've been monitoring a ticket closely. Over the course of two years, the ticket has accrued over 130 votes. It's the one about WSL support in PyCharm, and by extension, the rest of the JetBrains IDEs. When I say it's "the one", it's because this is probably the most famous ticket with regards to WSL in our tracker. So, the question is, why is this taking so long to implement?

The History of WSL Support

As things stand right now, WSL and WSL2 are both supported on PyCharm. However, the issue is not with the support itself, but rather how it is supported. WSL is currently supported directly via wsl.exe. We initially used SSH and SFTP to run commands and transfer files. We needed to do this because this was the only way in which we could support WSL at the time.

There were multiple reasons for this. WSL showed tremendous promise for people who wanted to develop on open source technologies. However, we needed to make sure that we could adapt to changes in WSL. At the same time, we were dealing with technology that was not our own, and we needed to be careful about building support that would need to be re-done.

However, the biggest problem stems from a limitation of the IntelliJ platform at the time. IntelliJ expects that it is working with a real file system, and in the case of remote machines, you don't have a real file system.

This is why we have a copy of the files on your local machine, which is then uploaded via SFTP. This means that whenever you make changes, there will be a delay before you can run them.

However, taking a deeper look at this, we begin to see the core of the issue: we need a better way to support remote development. By remote, I mean any remote host: WSL, but also any host on a remote machine, so that we would not have to build custom implementations for things like WSL from scratch. This is why we began working on a project called "Targets".

The Targets API

This new system provides a layer of abstraction over all remote hosts, whether it is WSL, an AWS or GCP machine, or any other machine for that matter. Now, we use the term "remote" loosely here, because to us, a remote is anything that is not the file system or the operating system that PyCharm is running on.

This means that the way to support interpreters will also change fundamentally; it also means that there is a lot of refactoring involved.

Think of the API as a matrix. Not The Matrix, but a matrix. If you want to support a new remote, then you need to start filling out that matrix, and you need to provide answers to how the IDE will handle different scenarios. So, for example, if you wish to add direct support for Docker or WSL, you will need to fill out the entire matrix of behaviours that can be done from the IDE.

Through this approach, we can indeed pave a way for all future remote targets, but it means that the transition to this API will be gradual, as a lot of the current functionality will need to be re-written in order to take advantage of this.

This also means that when complete, cloud providers will have an easier way of adding all kinds of functionality, and editing should become as fluid as editing on the filesystem itself (or so we hope).

Progress Thus Far

Our plan is to implement the Targets API in 2021, although we're still working through a few issues that arise from the implementation. It will first cover some basic things such as Docker support and remote interpreters; as the year progresses, we hope to add further support for WSL and bring it on par with all other remote targets.

Transcript

Nafiul: [00:00:00] Hello, all you beautiful PyCharmers. This is Early Access PyCharm with your host Nafiul Islam. Today I sit down with three people behind our WSL support and ask them some tough questions because a lot of people really want better support for WSL on PyCharm. So let's get into it.

Ilya: [00:00:26] Well, we started to support WSL as a remote interpreter via SSH
because at the time it was the only way to support it.

Nafiul: [00:00:36] This is Ilya. He's one of the people who works on the remote interpreter team, which supports WSL in PyCharm, along with Vladimir as well as Alex.

Ilya: [00:00:47] So user had to run open SSH server inside of WSL. And connect to each and they connect to any other remote server.
And I believe a couple of years ago, we switched to a new approach. And so users can now launch the WSL processes directly. Under the hood we run WSL.exe and provide the whole path to the Python interpreter and to this script and so on. This is how it works now.

Nafiul: [00:01:19] So Vladimir, can you just tell me how this all started?
Not just the WSL part, but also about remote interpreters in general.

Vladimir: [00:01:30] So it started even before we all had joined JetBrains. The oldest commits I've seen were made in 2012, if I'm not mistaken. So I believe that's when it started.

Nafiul: [00:01:45] So is this something that came from the IntelliJ platform or was this something that was made by the PyCharm team itself?

Vladimir: [00:01:51] No. As far as I am concerned initially it was made especially for PyCharm and just a few years ago it was moved to the whole platform.

Nafiul: [00:02:04] Okay. So something went out of PyCharm and became accepted in other IDEs. So that's pretty cool. This is not something that usually happens here at JetBrains. Usually it's IntelliJ that builds the platform. And the features just sort of end up in other IDEs.
So the question that I have is when you're using something like WSL or say Apple comes up with a, with a fancy new mechanism for virtualization. We don't know if that's ever going to happen, but essentially what is preventing us from incorporating or providing native support for something like WSL from the get-go.

Ilya: [00:02:49] Well, for WSL we have a couple of problems. The first one is that all IntelliJ products are initially configured to work with local files. Even if you have your project on some remote system, you still have to store your files locally, and the IntelliJ product will copy them to the remote server automatically.

Nafiul: [00:03:11] And how does the sync happen?

Ilya: [00:03:13] There is a special configuration called deployment, and IntelliJ monitors your files; when files are changed, they are copied to the remote server. Or in some cases they are copied before you launch your script.

Nafiul: [00:03:28] So essentially you have to copy the whole file.
You're not changing the files themselves on the server. Like you just do a complete upload. Is that how it works?

Ilya: [00:03:37] Yes. Some products do support very limited file editing on remote servers. As far as I know PhpStorm supports it: you can open one file and edit it, but the whole project should be stored on your local machine and you should use your locally installed version control and so on.

Nafiul: [00:04:00] I see. Okay. It makes sense, but explain this to me. You need to copy it back and forth, but so one of the issues that we have with WSL for example, is support for virtual environments, right? That does not seem to be limited by copying and pasting files that are being edited inside of the editor.
So what is kind of holding us back in terms of giving users that support on virtual machines or WSL or whatever.

Ilya: [00:04:31] It's more like a historical problem. We had very different approaches for virtual environments and the different interpreter types. But now we are trying to unify all these things together and want to finish this job.
We need a unified API which will give us the ability to create a virtual environment on any interpreter type, be it WSL or SSH or whatever.

Sasha: [00:05:01] Yes, actually Ilya said exactly what our plans are as of now. There are quite a lot of differences between local execution - local file system actions, working with files, executing files - and doing the same with remote machines.
So basically now we have two different implementations for almost every feature. Like, we have some extension points that are implemented differently for the local machine and SSH machines. I think this holds us back on some features that we are not exposing to users for remote development, like creating virtualenvs.
But generally the plan is that we are going to provide an API that allows us to use one code base for each of the features we provide and let the feature run on the local machine as well as over SSH, and even on Docker or some AWS instances and so on.

Nafiul: [00:06:12] So essentially what you're saying is the reason we haven't solved this problem is because we want to solve this problem, not just for WSL, but for problems like WSL in the future as well.
So that different kinds of machines, virtual, remote… whatever it is … can be supported with a minimum level of effort instead of having to build everything from scratch over and over again. Am I correct in understanding that?

Sasha: [00:06:40] Yeah, it is quite correct.

Nafiul: [00:06:43] So how difficult is this?

Sasha: [00:06:46] We already have a lot of source code for the different types of targets that we have, like the local machine, SSH, Docker.
We need to bring all this together, get a single piece of code for each of these features, and hide the differences between these targets under the API implementation. So ..

Nafiul: [00:07:11] what you're telling me is you have to change a lot of existing code, make sure that that doesn't break, unify all of that into a framework and then support all the stuff we already support.
And then you can have WSL.

Sasha: [00:07:29] I mean, then we will have some WSL features that we don't have now, because now we only have WSL support for project execution.

Nafiul: [00:07:39] Yes, absolutely. But essentially what I'm saying is that a lot of the features that we have right now will probably need to be reimplemented in order for everything to work, and that will probably need to be tested.

Is that what you're telling me? Like the mother of all refactorings.

Sasha: [00:07:57] Yeah, something like that. We did a lot of refactorings, for example for the SSH subsystem. I started it some time ago, I think three years ago. And then Vladimir came to our company, joined…

Nafiul: [00:08:10] You basically made him do all the hard work. Is that what you're saying?

Sasha: [00:08:13] Yes, he did the next iteration of the refactoring, actually. So yeah, we've got a lot of refactoring tasks, because we face new problems and sometimes it requires a complete - not complete, but a general - rewrite of the code. Yeah.

Nafiul: [00:08:34] Okay. That seems like a lot of work. So the question that I have is, once this Targets API is done, does that mean whenever somebody comes out with a new cloud, with a new way of doing things, with a new API - say for IBM cloud or for XYZ cloud or whatever - it will be far easier for them also to implement functionality within PyCharm?

Vladimir: [00:09:01] Yes. I believe the whole idea of the Targets API is to generalize the infrastructure for running processes and synchronizing files away from the high-level things like virtual environments, like Python interpreters and so on. So yes, we want to make a simple API that would allow various cloud companies, like IBM Cloud, like Amazon and so on, to just implement some interface for running a process and for synchronizing files between machines, and we'll keep all the things about virtualenvs and so on away from that API.

Nafiul: [00:09:50] I see, well, thank you very much, Vova, Ilya and Alexander. Thank you for answering some very tough questions and I hope to book you again soon.

Ilya: [00:09:59] Bye!

Nafiul: [00:10:00] And thank you for listening. If you want more of these podcasts, let us know on Twitter.

The post PyCharm and WSL first appeared on JetBrains Blog.

26 Feb 2021 6:27pm GMT

Python Software Foundation: Python Software Foundation Fellow Members for Q4 2020

It's that time of year! Let us welcome the new PSF Fellows for Q4! The following people continue to do amazing things for the Python community:

Batuhan Taskaya

Twitter, GitHub

Elaine Wong

Twitter, LinkedIn, GitHub, Website

Fiorella De Luca

Twitter, LinkedIn

Nicole Harris

Twitter, Website

Pablo Rivera

LinkedIn

Philip James

Twitter, GitHub, Website

Thank you for your continued contributions. We have added you to our Fellow roster online.

The above members help support the Python ecosystem by contributing to CPython, contributing to the PyLadies community, maintaining Python libraries, creating educational material, improving UX/UI for our infrastructure, organizing Python events and conferences, starting Python communities in local regions, and overall being great mentors in our community. Each of them continues to help make Python more accessible around the world. To learn more about the new Fellow members, check out their links above.

Let's continue to recognize Pythonistas all over the world for their impact on our community. The criteria for Fellow members are available online: https://www.python.org/psf/fellows/. If you would like to nominate someone to be a PSF Fellow, please send a description of their Python accomplishments and their email address to psf-fellow at python.org. We are accepting nominations for quarter 2 through May 20, 2021 (the Q1 cut-off has already passed!).

Work Group Needs Members

The Fellow Work Group is looking for more members from all around the world! If you are a PSF Fellow and would like to help review nominations, please email us at psf-fellow at python.org. More information is available at: https://www.python.org/psf/fellows/.

26 Feb 2021 4:52pm GMT

10 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: King Willams Town Station

Yesterday morning I had to go to the station in KWT to pick up our reserved bus tickets for the Christmas holidays in Cape Town. The station itself has been without a train connection since December for cost reasons - but Translux and co., the long-distance bus companies, have their offices there.






© benste CC NC SA

10 Nov 2011 10:57am GMT

09 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein

Nobody is worried about something like this - you just drive through by car, and in the city - near Gnobie - "nah, it only gets dangerous once the fire brigade is there" - 30 minutes later, on the way back, the fire brigade was there.




© benste CC NC SA

09 Nov 2011 8:25pm GMT

08 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Brai Party

Braai = a barbecue evening or the like.

The wannabe technicians mending their SpeakOn / jack plug splitters...

The ladies - the "Mamas" of the settlement - during the official opening speech

Even though fewer people came than expected: loud music and lots of people ...

And of course a fire with real wood for the braai.

© benste CC NC SA

08 Nov 2011 2:30pm GMT

07 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Lumanyano Primary

One of our missions was bringing Katja's Linux Server back to her room. While doing that we saw her new decoration.

Björn, Simphiwe carried the PC to Katja's school


© benste CC NC SA

07 Nov 2011 2:00pm GMT

06 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Nelisa Haircut

Today I went with Björn to Needs Camp to visit Katja's guest family for a special party. First of all we visited some friends of Nelisa - yeah, the one I'm working with in Quigney, Katja's guest father's sister - who gave her a haircut.

African women usually get their hair done by adding extensions rather than, like Europeans, just cutting some hair.

In between she looked like this...

And then she was done - looks amazing considering the amount of hair she had last week - doesn't it ?

© benste CC NC SA

06 Nov 2011 7:45pm GMT

05 Nov 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: My Saturday

Somehow it occurred to me today that I need to restructure my blog posts a bit - if I only ever report on new places, I would have to be on a round trip. So here are a few things from my everyday life today.

First of all, Saturday counts as a day off, at least for us volunteers.

This weekend only Rommel and I are on the farm - Katja and Björn are now at their placements, and my housemates Kyle and Jonathan are at home in Grahamstown, as is Sipho, who lives in Dimbaza.
Robin, Rommel's wife, has been in Woodie Cape since Thursday to take care of a few things there.
Anyway, this morning we first treated ourselves to a shared Weetbix/muesli breakfast and then set off for East London. Two things were on the checklist - Vodacom and Ethienne (the estate agent) - plus dropping off the missing items at NeedsCamp on the way back.

Just after setting off on the dirt road we realised that we hadn't packed the things for NeedsCamp and Ethienne, but we did have the pump for the water supply in the car.

So in East London we first drove to Farmerama - no, not the online game Farmville, but a shop with all kinds of things for a farm - in Berea, a northern suburb.

At Farmerama we got advice on a quick-release coupling that should make life with the pump easier, and we also dropped off a lighter pump for repair, so that it isn't such a big effort every time the water runs out again.

Fego Caffé is in the Hemmingways Mall; there we had to get the PIN and PUK for one of our data SIM cards, because unfortunately some digits got swapped when entering the PIN. In any case, shops in South Africa store data as sensitive as a PUK - which basically gives access to a locked phone.

In the café Rommel then carried out a few online transactions with the 3G modem, which was working again - and which, by the way, now works perfectly in Ubuntu, my Linux system.

On the side I went to 8ta to find out about their new deals, since we want to offer internet in some of Hilltop's centres. The picture shows the UMTS coverage in NeedsCamp, Katja's village. 8ta is a new phone provider from Telkom; after Vodafone bought Telkom's shares in Vodacom, they have to build up their network completely from scratch.
We decided to organise a free prepaid card to test, because who knows how accurate the coverage map above is ... Before signing even the cheapest 24-month deal, you should know whether it works.

After that we went to Checkers in Vincent, looking for two hotplates for Woody Cape - R 129.00 each, so about 12€ for a two-ring hotplate.
As you can see in the background, there are already Christmas decorations - at the beginning of November, and that in South Africa at a sunny, warm 25°C minimum.

For lunch we treated ourselves to a Pakistani curry takeaway - highly recommended!
Well, and after we got back an hour or so ago, I cleaned the fridge, which I had simply put outside to defrost this morning. Now it's clean again and without a 3 m thick layer of ice...

Tomorrow ... yes, I'll report on that separately ... but probably not until Monday, because then I'll be back in Quigney (East London) and will have free internet.

© benste CC NC SA

05 Nov 2011 4:33pm GMT

31 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Sterkspruit Computer Center

Sterkspruit is one of Hilltop's Computer Centres in the far north of the Eastern Cape. On the trip to J'burg we used the opportunity to take a look at the centre.

Pupils in the big classroom


The Trainer


School in Countryside


Adult Class in the Afternoon


"Town"


© benste CC NC SA

31 Oct 2011 4:58pm GMT

Benedict Stein: Technical Issues

What do you do in an internet café when your ADSL and fax line have been cut off before month's end? Well, my idea was to sit outside and eat some ice cream.
At least it's sunny and not as rainy as on the weekend.


© benste CC NC SA

31 Oct 2011 3:11pm GMT

30 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: Nellis Restaurant

For those who are travelling through Zastron - there is a very nice restaurant serving delicious food at reasonable prices.
In addition, they sell home-made juices, jams and honey.




interior


home made specialities - the shop in the shop


the Bar


© benste CC NC SA

30 Oct 2011 4:47pm GMT

29 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: The way back from J'burg

On the 10-12 hour trip from J'burg back to ELS I was able to take a lot of pictures, including these different roadsides.

Plain Street


Orange River in its beginnings (near Lesotho)


Zastron Anglican Church


The Bridge in Between "Free State" and Eastern Cape next to Zastron


my new Background ;)


If you listen to Google Maps you'll end up travelling 50 km of gravel road - as it had just been resurfaced we didn't have that many problems and saved an hour compared to going the official way with all its construction sites




Freeway


getting dark


© benste CC NC SA

29 Oct 2011 4:23pm GMT

28 Oct 2011

feedPython Software Foundation | GSoC'11 Students

Benedict Stein: How does a construction site actually work?

Sure, some things may be different and a lot may be the same - but a road construction site is an everyday sight in Germany - so how does that actually work in South Africa?

First of all - NO, no natives digging with their hands - even though more manpower is used here, they are busy working with technology.

A perfectly normal main road


and how it is being widened


looooots of trucks


because here one side is completely closed over a long stretch, resulting in a traffic-light arrangement with, in this case, a 45-minute wait


But at least they seem to be having fun ;) - as did we, since luckily we never had to wait longer than 10 minutes.

© benste CC NC SA

28 Oct 2011 4:20pm GMT