Page MenuHomePhabricator

MGerlach (Martin Gerlach)
Senior Research Scientist

Projects

Today

  • Clear sailing ahead.

Tomorrow

  • Clear sailing ahead.

Wednesday

  • Clear sailing ahead.

User Details

User Since
Sep 9 2019, 9:50 AM (264 w, 12 h)
Availability
Available
IRC Nick
mgerlach
LDAP User
MGerlach
MediaWiki User
MGerlach (WMF) [ Global Accounts ]

Recent Activity

Aug 16 2024

MGerlach added a comment to T369292: Research support for hypothesis in KR WE.3.1 (Q1).

weekly update

  • no updates this week as I was attending ACL 2024 conference.
Aug 16 2024, 8:27 AM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach added a comment to T369288: [WE.3.1.3] Building a model for content simplification (Q1).

weekly update

  • no updates this week as I was attending ACL 2024 conference.
Aug 16 2024, 8:26 AM · OKR-Work, Research (FY2024-25-Research-July-September)

Aug 8 2024

MGerlach added a comment to T369292: Research support for hypothesis in KR WE.3.1 (Q1).

weekly update:

  • no updates this week as there were no requests for additional support so far
Aug 8 2024, 3:53 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach added a comment to T369288: [WE.3.1.3] Building a model for content simplification (Q1).

weekly update:

  • Updated the project page on meta with current status: https://meta.wikimedia.org/wiki/Research:Develop_a_model_for_text_simplification_to_improve_readability_of_Wikipedia_articles/FY24-25_WE.3.1.3_content_simplification
  • Identified informative/interpretable metrics to evaluate performance of simplification models. For the manual evaluation, the most common approach is to judge 3 dimensions: simplicity (is the text simpler), fluency (is the text grammatical), and adequacy (does the text preserve the meaning). Looking into the literature, we can define automated metrics which approximate judgements along these dimensions.
    • Simplicity: Measure the change in readability score using our readability scoring model for Wikipedia articles.
    • Fluency: Count the number of grammatical errors using LanguageTool
    • Adequacy: Measure the factual consistency between the original and simplified text to detect, e.g., “model hallucination”. This is a very active field of research and several methods have been proposed recently such as FactCC (based on a trained classification model), SummaC (based on textual entailment), or QuestEval (based on question generation and answering).
  • The advantage of these metrics is that they are more interpretable and that they dont require reference samples from a ground truth dataset. This will hopefully make it easier to obtain confidence about whether models are “good enough” for potential use in production.
  • Started to implement metrics so we can automatically measure model performance.
  • Next step: Finalize implementation of metrics for evaluating model performance and run on representative sample with existing model.
Aug 8 2024, 3:50 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Aug 2 2024

MGerlach claimed T364486: Update the Research Handbook to clarify how to handle news coverage of our research.

Update: I added a draft section about this to the handbook https://office.wikimedia.org/wiki/Research/Handbook/Communication#Communicating_with_public_media

Aug 2 2024, 3:37 PM · Research-management, Research-outreach, Research
MGerlach added a comment to T369292: Research support for hypothesis in KR WE.3.1 (Q1).

weekly update:

  • I put together documentation for article recommendations with different tools from Research for experiments WE.3.1.1 (doc)
  • Shared documentation with hypothesis owner for feedback.
Aug 2 2024, 3:16 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach added a comment to T369288: [WE.3.1.3] Building a model for content simplification (Q1).

weekly update:

  • Built two working model prototypes for content simplification/summarization
    • 1) Simplification: Generate a simplified version of a section/paragraph using simpler language.
    • 2) Section-gist: Generate a plain language summary of a section (i.e. combining simplification and summarization)
  • Put together detailed documentation about the two models with examples and tutorial notebooks on how the models can be run (doc). I will add these updates to the project-meta page too in the next week or so.
  • Exploring alternatives for automatic evaluation of models to make it easier to iterate through different model variants without the need for manual evaluation.
  • Experimenting with the test-deployment in the staging instance of LiftWing (available thanks to the ML-Team) of the Aya-23 model (Aya-23-8B) which is used for the section gist model. The model works and returns requests in a reasonable time. Though the model quality seems substantially lower than the larger model I used in the experiments using the external API. ML Team is planning to test-deploy the larger model version (Aya-23-35B) in the next weeks.
Aug 2 2024, 3:15 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach updated the task description for T369288: [WE.3.1.3] Building a model for content simplification (Q1).
Aug 2 2024, 3:13 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach closed T364852: Feedback on Edge Uniques draft design document as Resolved.

Update: Fabian, Isaac, and I coordinated and left detailed feedback and comments in the design doc. Resolving the task as the request in the task description is completed. We will of course continue to engage in any follow-up discussions in the doc.

Aug 2 2024, 2:28 PM · OKR-Work, Research

Jul 29 2024

MGerlach added a comment to T219903: Keep research.wikimedia.org landing page updated.

For the next round of updates, could you add the following items:

  • Add to Publications page and Knowledge Gaps publications:
  • Also add blurb to Knowledge Gaps updates:
    • Date: Aug 2024
    • Title: A multilingual model for measuring readability
    • Blurb: We published a new paper at ACL ‘24 where we develop a multilingual model to score the readability of Wikipedia articles across languages.
    • Link: https://arxiv.org/abs/2406.01835
Jul 29 2024, 12:07 PM · Patch-For-Review, periodic-update, Research

Jul 26 2024

MGerlach added a comment to T369292: Research support for hypothesis in KR WE.3.1 (Q1).

weekly update

  • Shared results on section-gists of Wikipedia articles (T369288#10018291) with Web team as one potential approach for experiments on summarization and simplification for readers.
Jul 26 2024, 3:25 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach added a comment to T369288: [WE.3.1.3] Building a model for content simplification (Q1).

weekly update:

  • Ran small-scale experiments to automatically generate section-gists (i.e. plain-language summaries of a section) for Wikipedia articles using different models.
    • Test-dataset with original and simplified text from only the lead section of 10 articles in English, German, Portuguese (see spreadsheet)
    • Sample-dataset of several sections from the same article without the reference-simplification (see spreadsheet)
  • We are able to automatically generate section-gists (plain-language summaries of a section) of Wikipedia articles across languages using the LLM Aya-23. Based on qualitative evaluation of small sample (<100), the results are promising that the model can provide a concise and easy-to-understand overview of a long piece of text. I successfully tested English, German, and Portuguese, but the model supports 23 languages supposedly covering nearly half the world’s population in terms of speakers (which is way more than any other comparable LLM that I am aware of).
  • Evaluation of text simplification is challenging. Automated metrics (such as SARI) dont seem to be very useful to rigorously assess model performance for specific use cases. Simple baselines such as returning a truncated version of the input-text can yield surprisingly good results. As a result, we should probably not rely on these metrics and, instead, resort to manually judging (samples of) the model output which is, however, much more resource-intensive.
Jul 26 2024, 3:22 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach updated the task description for T369288: [WE.3.1.3] Building a model for content simplification (Q1).
Jul 26 2024, 3:20 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Jul 19 2024

MGerlach added a comment to T369292: Research support for hypothesis in KR WE.3.1 (Q1).

weekly update:

  • Android reached out to understand more about recommended content within search. I shared some resources from research on search and articles recommendations (e.g. from list building)
  • Web is starting with first experiments on recommendations in search. Providing support for using the article-similarity search from the list-building tool https://list-building.toolforge.org/
  • Web is preparing to start thinking about experiments on simplifications which will happen later in the quarter. Ongoing discussions about what simplification would be useful and how to evaluate.
Jul 19 2024, 3:19 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach added a comment to T369288: [WE.3.1.3] Building a model for content simplification (Q1).

weekly summary:

  • Clarified the criteria and constraints for the model
    • Multilingual: support (some) languages other than English
    • Openness: it needs to be open.
    • Resources: We need to be able to host the model in our infrastructure in LiftWing with reasonable inference time.
    • Use-case: We need to define the use-case; for example, should simplification be on the level of sentence, paragraph, section, or the full article?
    • Quality: Ensure the output is useful in practice according to some metric
  • Did a deep-dive on recent works on text simplification with language models (reviewing 26 papers from the past 2-3 years see below [1]). This helped me to understand the most common and promising strategies to approach the task and to identify challenges.
  • Some of the main learnings came from a paper (Paper Plain) which aims to improve access to medical papers. Based on interviews with readers about barriers for interacting with content and usability testing, they identify section gists as a valuable and most frequently-used feature by non-expert readers.
    • Operationalize simplification as a plain-language summary of a section
    • For example, use a prompt: “Summarize content you are provided with for a fifth-grade student.”
    • This approach seems highly effective as most LLMs are explicitly trained on the task of summarization across languages (e.g. XLSum)
  • I also reviewed some of the recent large language models that could be good candidates. I identified the Aya-23 model as a promising candidate model
    • Multilingual: It is multilingual supporting 23 languages (these languages cover approximately half the world’s population)
    • It is an open-weight model and available in Hugging Face
    • Given the successful test-deployment of the similarly-sized Gemma2-27B-it on LiftWing (T369055), it seems that this model could in principle be also hosted.
    • The model can be prompted to generate plain-language summaries of sections without fine-tuning
    • Some initial tests with the API-endpoint looked promising across several languages.
Jul 19 2024, 2:55 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Jul 12 2024

MGerlach closed T361942: Run revised 3rd survey on readability perception, a subtask of T325815: Understanding perception of readability in Wikipedia, as Resolved.
Jul 12 2024, 2:35 PM · Research
MGerlach closed T361942: Run revised 3rd survey on readability perception as Resolved.

weekly update:

Jul 12 2024, 2:35 PM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T369292: Research support for hypothesis in KR WE.3.1 (Q1).

weekly update:

  • I reached out to all hypothesis owners in WE.3.1 individually We also had a joint meeting to support Support WE.3.1. From this, I obtained a much clearer picture about the needed support
  • WE.3.1.1: might need light support (consulting) for options to generate recommendations. Substantial support needed for experiments on simplification/summarization. Web Team is starting to think about specifications in more detail. So these are ongoing discussions at the moment.
  • WE.3.1.4: No support needed at this point. They will focus on figuring out what work would need to be done for scaling search (e.g. morelike). They also want to start looking into vector search which would likely require some support from Research (e.g. creating vectors/embeddings). However, they are starting from scratch and during Q1 will start to figure out what they want and what support would be needed in the future.
  • WE.3.1.5: No support needed at this point. They want to skip the use of orphans for the first round of experiments.
Jul 12 2024, 12:58 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach added a comment to T369288: [WE.3.1.3] Building a model for content simplification (Q1).

Weekly update:

  • Systematizing open questions for successful model development
    • Infrastructure requirements: size, performance (inference time)
    • Product requirements: Languages, Quality, Use-case
    • Model requirements: Openness (can we host), Effectiveness (does the model have a chance to perform well), training (supervised, in-context learning, zero-shot as is), evaluation
  • Gathering input from product team (Web) on intended use to tailor model specifications (languages, input/output format, quality boundaries). These are ongoing discussions but should become more clear in the next week or so
  • Learning about updates in ML infrastructure which expands the potential set of candidate models. The ML team announced that we will have new servers will GPUs available for model training and hosting. This will allow us to use larger (and potentially better) models. Similarly, I am following the test-deployment of the Gemma2-model on LiftWing T369055. If successful, this could constitute a promising candidate model for the simplification task (or models with similar size/architecture).
  • Reviewing scientific literature to compile list of candidate models for the task. I have identified more than 10 recent papers (2023/2024) on using LLMs for text simplification approach. This is very insightful to understand which models are most promising for the specific task. For example, a promising model is Aya-23, an open model with dedicatedly multilingual support and, in principle, compatible with out infrastructure constraints (see above)..
Jul 12 2024, 12:55 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Jul 10 2024

MGerlach created T369712: Request to update Readability model on Lift Wing.
Jul 10 2024, 12:54 PM · Lift-Wing, Machine-Learning-Team

Jul 5 2024

MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • ran the first part of the survey this week. next week will be the second part.
Jul 5 2024, 1:39 PM · Research (FY2024-25-Research-July-September)
MGerlach updated the task description for T361942: Run revised 3rd survey on readability perception.
Jul 5 2024, 1:38 PM · Research (FY2024-25-Research-July-September)

Jul 4 2024

MGerlach added a subtask for T342614: Models for text summarization using LLMs: T369288: [WE.3.1.3] Building a model for content simplification (Q1).
Jul 4 2024, 1:55 PM · Epic, address-knowledge-gaps, Research
MGerlach added a parent task for T369288: [WE.3.1.3] Building a model for content simplification (Q1): T342614: Models for text summarization using LLMs.
Jul 4 2024, 1:55 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach created T369292: Research support for hypothesis in KR WE.3.1 (Q1).
Jul 4 2024, 1:52 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach renamed T369288: [WE.3.1.3] Building a model for content simplification (Q1) from [WE.3.1.3] Building a model for content simplification to [WE.3.1.3] Building a model for content simplification (Q1).
Jul 4 2024, 1:48 PM · OKR-Work, Research (FY2024-25-Research-July-September)
MGerlach created T369288: [WE.3.1.3] Building a model for content simplification (Q1).
Jul 4 2024, 1:48 PM · OKR-Work, Research (FY2024-25-Research-July-September)

Jul 1 2024

MGerlach added a comment to T357692: OpenReview for WikiWorkshop.

@MGerlach if you have any other observations, please add them to this task. We'll review this as part of the retro and it's good to have as we prepare for next year and know what we need to address in advance. I'll resolve the task after you're done :)

@KinneretG A gree with everything you wrote. In general though I did like OpenReview. One additional comment:

  • setting of deadlines was public (e.g. for submission deadline of reviews for PC members). We often set the deadline in OpenReview to some time (e.g. a few hours or a day) after the officially communicated deadline in order to accommodate also those who run into technical or other issues. However, the internally set deadline is also visible publicly for the relevant folks. This lead to some confusion as to what the actual deadline is.
Jul 1 2024, 12:18 PM · Research
MGerlach added a comment to T361926: Improve training and inference pipeline for multilingual link recommendation model.

Moving this to the next quarter (FY2024-25-Research-July-September) as the work is not yet fully completed

  • the MR for the pipeline in airflow is submitted.
  • this needs some code-review by research-engineering before it can be merged. this should be completed by mid-July (see T361929#9935541)
  • we expect to resolve the task in the next 1-2 weeks
Jul 1 2024, 10:13 AM · Research, Essential-Work
MGerlach moved T361926: Improve training and inference pipeline for multilingual link recommendation model from FY2023-24-Research-April-June to FY2024-25-Research-July-September on the Research board.
Jul 1 2024, 10:08 AM · Research, Essential-Work
MGerlach closed T355729: Organize the Research track - Wiki Workshop 2024 as Resolved.

We succesfully ran the research track at wikiworkshop2024.
Closing this task as all sub-tasks have been completed.

Jul 1 2024, 9:50 AM · Research
MGerlach closed T355729: Organize the Research track - Wiki Workshop 2024, a subtask of T356020: Wiki Workshop 2024 - Q3, as Resolved.
Jul 1 2024, 9:49 AM · Research (FY2023-24-Research-January-March)
MGerlach closed T355729: Organize the Research track - Wiki Workshop 2024, a subtask of T340599: Organize Wiki Workshop 2024 (June 20th), as Resolved.
Jul 1 2024, 9:49 AM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

Moving to FY204-25-Research-July-September for now as we need 1 more week to run actually run the survey.

Jul 1 2024, 9:47 AM · Research (FY2024-25-Research-July-September)
MGerlach moved T361942: Run revised 3rd survey on readability perception from FY2023-24-Research-April-June to FY2024-25-Research-July-September on the Research board.
Jul 1 2024, 9:45 AM · Research (FY2024-25-Research-July-September)

Jun 28 2024

MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • finished internal testing
  • implemented two changes based on feedback: i) make the survey shorter by removing a few pairs, ii) add a question about how regularly participants use Wikipedia.
  • figuring out available resources on prolific
  • planning to deploy next week
Jun 28 2024, 12:45 PM · Research (FY2024-25-Research-July-September)
MGerlach closed T361944: Orphan articles as reading recommendations as Resolved.

weekly updates:

  • finally I managed to spend some time on this.
  • I figured out that one of the main bottlenecks was to calculate the indegree for each potential link-target (this is crucial since we want to prioritize articles with low indegree such as orphans). My initial approach was to use the Linkshere-API. However, this requires a separate call for each individual article. A much cheaper alternative is to query the replicas, with only a single query for potentially hundreds of articles for which we want to get the indegree (see the example in quarry). The replicas can be easily queried from toolforge (wikitech-documentation, example script in PAWS). For an example article, I could reduce the query-time 10-fold.
  • I integrated this (and a few other fixes to improve the recommendations) into the latest version of the tool. Example: https://linkrec.toolforge.org/readmore?lang=en&title=Tiwanaku
  • the example still takes some time but its much less likely to just timeout since the number of API calls is much smaller.
  • in case, we need much more substantial speedups, we might want to resort to other heuristics such as the one mentioned above (T361944#9809372) using morelikethis in cirrussearch combined with the orphans-template
Jun 28 2024, 8:49 AM · Research (FY2023-24-Research-April-June)
MGerlach closed T361947: Evaluate simplification in multilingual setting as Resolved.

weekly update:

Jun 28 2024, 8:33 AM · Research (FY2023-24-Research-April-June)
MGerlach closed T361947: Evaluate simplification in multilingual setting, a subtask of T342614: Models for text summarization using LLMs, as Resolved.
Jun 28 2024, 8:32 AM · Epic, address-knowledge-gaps, Research

Jun 21 2024

MGerlach added a comment to T361947: Evaluate simplification in multilingual setting.

weekly updates:

  • Ran and evaluated experiments with fine-tuned Flan-T5 on other languages. The main problem is that if the model is only trained on article pairs (original/simplified) in English, the model output (simplified) will most often be in English, even if the model input (original) is in, say, German or French. Thus, the model will perform a simplification AND translation in to English since it has seen only English samples of simplfiied text.
  • As a potential solution I am experimenting with the following setup: i) fine-tune the model with additional samples from the other languages, ii) add a prefix "Simplify in <LANG>: " to the model-input. This will provide additional instructions for the model treating the simplifications in different languages as different but related tasks.
Jun 21 2024, 4:32 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • internal testing of the limesurvey (including integration with prolific)
  • next week: deployment on prolific
Jun 21 2024, 2:53 PM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T349774: Maintain wikiworkshop.org website.

@MGerlach done.

Jun 21 2024, 1:27 PM · Research, Patch-For-Review
MGerlach added a comment to T349774: Maintain wikiworkshop.org website.

@DDeSouza could you please remove the pdf for the paper (the sooner the better so it doesnt get indexed by GoogleScholar etc):
"Structural Evolution of Co-Creation in Wikipedia [PDF]
Negin Maddah and Babak Heydari"
They opted out of having the pdf put on the website.
Thank you

Jun 21 2024, 8:11 AM · Research, Patch-For-Review

Jun 14 2024

MGerlach added a comment to T361944: Orphan articles as reading recommendations .

weekly update:

  • no updates
Jun 14 2024, 5:43 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • finalized the Limesurvey
  • will test the Limesurvey internally next week (after that can be deployed)
Jun 14 2024, 5:42 PM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T361947: Evaluate simplification in multilingual setting.

weekly update:

  • started setting up the experiments for evaluating fine-tuned Flan-T5 on other languages
  • will run experiments next week
Jun 14 2024, 5:41 PM · Research (FY2023-24-Research-April-June)
MGerlach closed T352545: Organize sessions for research track as Resolved.

weekly update:

Jun 14 2024, 5:41 PM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach closed T352545: Organize sessions for research track, a subtask of T355729: Organize the Research track - Wiki Workshop 2024, as Resolved.
Jun 14 2024, 5:41 PM · Research
MGerlach updated the task description for T352545: Organize sessions for research track.
Jun 14 2024, 5:38 PM · Research (FY2023-24-Research-April-June), Research-outreach

Jun 10 2024

MGerlach closed T362416: Attend ICWSM 2024 conference as Resolved.
Jun 10 2024, 2:34 PM · Research-foundational, address-knowledge-gaps, Research-outreach, Research
MGerlach updated the task description for T362416: Attend ICWSM 2024 conference.
Jun 10 2024, 2:34 PM · Research-foundational, address-knowledge-gaps, Research-outreach, Research

May 30 2024

MGerlach added a comment to T361944: Orphan articles as reading recommendations .

weekly update:

  • no update. main focus was preparing for attending ICWSM T362416
May 30 2024, 4:45 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • no update. main focus was preparing for attending ICWSM T362416
May 30 2024, 4:44 PM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T361947: Evaluate simplification in multilingual setting.

weekly update:

  • no update. main focus was preparing for attending ICWSM T362416
May 30 2024, 4:44 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T352545: Organize sessions for research track.

weekly update:

  • iterating on the sessions and the session chairs
  • planning to finalize next week
May 30 2024, 4:42 PM · Research (FY2023-24-Research-April-June), Research-outreach

May 27 2024

MGerlach closed T352543: Review and select submissions for research track as Resolved.

FInal update:

  • send notifications to authors
  • website will be updated with accepted extended abstracts and the members of the program committee
  • also send thank-you email to reviewers and inviting them to register to attend the event
May 27 2024, 9:32 AM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach closed T352543: Review and select submissions for research track, a subtask of T355729: Organize the Research track - Wiki Workshop 2024, as Resolved.
May 27 2024, 9:31 AM · Research
MGerlach updated the task description for T352543: Review and select submissions for research track.
May 27 2024, 9:30 AM · Research (FY2023-24-Research-April-June), Research-outreach

May 24 2024

MGerlach added a comment to T361944: Orphan articles as reading recommendations .

weekly update:

  • no update since my main focus was on Wiki Workshop T352543
May 24 2024, 3:55 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361947: Evaluate simplification in multilingual setting.

weekly update:

  • no update since my main focus was on Wiki Workshop T352543
May 24 2024, 3:55 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • working on finalizing the limesurvey with manually verified samples
May 24 2024, 3:53 PM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T352545: Organize sessions for research track.

weekly update:

  • finalized session format and alignment with workshop schedule
  • created first draft of organizing papers into sessions
  • created shortlist of session chairs
May 24 2024, 3:51 PM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach added a comment to T352543: Review and select submissions for research track.

weekly udpate:

  • made decisions about all submissions
  • will be sending out notifications to authors next week (May 27)
May 24 2024, 3:50 PM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach updated the task description for T352543: Review and select submissions for research track.
May 24 2024, 3:50 PM · Research (FY2023-24-Research-April-June), Research-outreach

May 17 2024

MGerlach added a comment to T361944: Orphan articles as reading recommendations .

weekly update:

  • investigated how we could take advantage of existing recommendations from the RelatedArticles feature. the feature obtains recommendations from queries to cirrussearch' morelike. There seem to be two interesting options from the options available in the API
    • filtering or boosting with the number of inlinks (i.e. prioritizing articles with low indegree), e.g., via Srsort:incoming_links_asc
    • using the m̀orelikethis`option which allows to specify specific templates. In our case, we could then require recommendations to have the Orphans-template. The query then yields related articles that are marked as orphans via the corresponding template. Example query: https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=morelikethis:Tiwanaku%20hastemplate:%22orphan%22&srlimit=10
    • especially, the latter approach seems promising (as it could also be adapted to other languages) but would require some evaluation of the quality of the recommendations. while the recommendations are orphans, it is not clear how "well" they are related to the query-article. however, in contrast to link translation, the queries via morelike/morelikethis are much faster.
May 17 2024, 4:20 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • finishing manual validation of the selected snippets that will be shown to participants
May 17 2024, 4:07 PM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T361947: Evaluate simplification in multilingual setting.

weekly update:

  • no update. ready to pick up next week since T354653 was completed
May 17 2024, 4:05 PM · Research (FY2023-24-Research-April-June)
MGerlach closed T354653: Work on model optimization and scaling as Resolved.

weekly update

May 17 2024, 4:05 PM · Research (FY2023-24-Research-April-June)
MGerlach closed T354653: Work on model optimization and scaling, a subtask of T342614: Models for text summarization using LLMs, as Resolved.
May 17 2024, 4:04 PM · Epic, address-knowledge-gaps, Research
MGerlach added a comment to T352545: Organize sessions for research track.

weekly update:

  • Started to plan format and schedule and candidates for sessions
  • Started to group papers into themed sessions
May 17 2024, 4:04 PM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach updated the task description for T352545: Organize sessions for research track.
May 17 2024, 4:02 PM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach added a comment to T352543: Review and select submissions for research track.

weekly update

  • Sending reminders to reviewers
  • Made sure we have 3 emergency reviewers available for short-notice reviews during next week
  • Drafted decision emails
  • Preparing decision stage (to be completed next week)
May 17 2024, 4:02 PM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach updated the task description for T352543: Review and select submissions for research track.
May 17 2024, 4:00 PM · Research (FY2023-24-Research-April-June), Research-outreach

May 15 2024

MGerlach added a comment to T362957: Create a dataset for training/evaluating models for summarizing (long) discussions.

@Htriedman this is super useful. thanks for looking into this and sharing your results.
For future reference: The above code looks for all pages that use the template {{Closed_rfc_top}}. This template is used to close a Request for comment (RfC) on a talk page or a noticeboard. Thus, we can easily identify pages that contain relevant (i.e. closed) discussions. The above code then goes a lot further by parsing the content of the page to extract the discussion in a tree-like structure (capturing comments etc).

May 15 2024, 8:18 AM · Research-Freezer, Wikimedia-Hackathon-2024

May 8 2024

MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • finalized design of the Limesurvey
  • next step: add selected snippets to survey
May 8 2024, 4:56 PM · Research (FY2024-25-Research-July-September)
MGerlach created T364486: Update the Research Handbook to clarify how to handle news coverage of our research.
May 8 2024, 4:23 PM · Research-management, Research-outreach, Research
MGerlach closed T362751: [Session] ML/AI models on LiftWing as Resolved.

Ran session at Wikimedia Hackathon (see program)

May 8 2024, 9:22 AM · Wikimedia-Hackathon-2024

May 7 2024

MGerlach placed T362957: Create a dataset for training/evaluating models for summarizing (long) discussions up for grabs.

Unassigning myself as I am not planning to work on this in the near future (next 3 months)

May 7 2024, 3:12 PM · Research-Freezer, Wikimedia-Hackathon-2024

May 6 2024

MGerlach closed T362419: Attend Wikimedia Hackathon 2024 as Resolved.

I ran a session on ML models developed by the Research Team that are hosted in LiftWing (T362751). There was a lot of interest with ~20 attendees and a good discussion. Several developers were interested in exploring how to use the models in their projects. For example, folks from Maroccon Arabic Wikipedia were interested to use the revert-risk model for a bot for vandalism detection. There was also questions about how to use the models for third-party wikis. More generally, there was an interest in using larger prompt-based LLMs in different applications.

May 6 2024, 8:55 AM · Essential-Work, address-knowledge-gaps, Research-foundational, Research-outreach, Research

May 5 2024

MGerlach added a comment to T362957: Create a dataset for training/evaluating models for summarizing (long) discussions.

I didnt get to work a lot on this during the hackathon as I was mostly focusing on T362805.

May 5 2024, 8:39 AM · Research-Freezer, Wikimedia-Hackathon-2024
MGerlach added a comment to T362805: Build a tool (or tools) to easily visualize differentially-private datasets.

set up an API endpoint on cloud-vps as a backend from where we can query the dataset for individual pages.

May 5 2024, 8:05 AM · Technical-Tool-Request, Wikimedia-Hackathon-2024

May 3 2024

MGerlach added a comment to T352543: Review and select submissions for research track.

weekly update:

  • finalized assignment of reviewers to papers. mainly relied on bids from reviewers (90% entered their bids) with some manual adjustments
  • sent review assignments to reviewers (deadline is May 20)
May 3 2024, 1:47 PM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach updated the task description for T352543: Review and select submissions for research track.
May 3 2024, 1:45 PM · Research (FY2023-24-Research-April-June), Research-outreach

Apr 26 2024

MGerlach added a comment to T362419: Attend Wikimedia Hackathon 2024.

update:

  • scheduled the session on ML models in the official program
  • started preparing the content of the sessions (which slides and what I could demo)
Apr 26 2024, 4:24 PM · Essential-Work, address-knowledge-gaps, Research-foundational, Research-outreach, Research
MGerlach updated the task description for T362419: Attend Wikimedia Hackathon 2024.
Apr 26 2024, 4:23 PM · Essential-Work, address-knowledge-gaps, Research-foundational, Research-outreach, Research
MGerlach added a comment to T361947: Evaluate simplification in multilingual setting.

weekly update:

Apr 26 2024, 4:21 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T354653: Work on model optimization and scaling.

weekly updates:

  • progressing on the write-up of the results. I spent some time to collect some representative examples for complementing the results with some qualitative evaluation. Looking at the output of different models in more detail, it seems there are strong differences that are not well captured by the automatic evaluation metrics (e.g. SARI)
Apr 26 2024, 4:20 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361944: Orphan articles as reading recommendations .

weekly update:

  • no update this week because I didnt manage to free up time for this task (shorter week, annual planning deadlines, wiki workshop submission deadline)
Apr 26 2024, 4:17 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • prepared and submitted an extended abstract to wikiworkshop with preliminary results. this is not directly related to the survey but will give us additional feedback for improving later versions of this survey.
  • Decided on final sampling of snippets and design details (buttons etc)
  • next step: implement changes into the Limesurvey
Apr 26 2024, 4:15 PM · Research (FY2024-25-Research-July-September)
MGerlach updated the task description for T361942: Run revised 3rd survey on readability perception.
Apr 26 2024, 4:13 PM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T352543: Review and select submissions for research track.

weekly update:

  • submission deadline passed: we received 50 submissions
  • reviewer bidding is ongoing until end of this week
  • next step: finalizing review assignment early next week
Apr 26 2024, 3:18 PM · Research (FY2023-24-Research-April-June), Research-outreach

Apr 25 2024

MGerlach moved T362751: [Session] ML/AI models on LiftWing from Proposed sessions to Scheduled Sessions on the Wikimedia-Hackathon-2024 board.
Apr 25 2024, 5:02 PM · Wikimedia-Hackathon-2024

Apr 24 2024

MGerlach added a comment to T349774: Maintain wikiworkshop.org website.

Request for a small (but important) change:

Thanks

Apr 24 2024, 7:59 AM · Research, Patch-For-Review

Apr 19 2024

MGerlach added a comment to T361944: Orphan articles as reading recommendations .

weekly update:

  • no update this week
Apr 19 2024, 4:14 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T361942: Run revised 3rd survey on readability perception.

weekly update:

  • extracted 100 suitable pairs of original/simplified snippets for comparison (sheet)
  • from this we will select two groups of each 10 pairs for the pilot: i) "treatment group": the two version of the pair are very different in their automatic readability score; ii) "control group": the two versions of the pair are very similar in their automatic readability score.
Apr 19 2024, 4:13 PM · Research (FY2024-25-Research-July-September)
MGerlach added a comment to T361947: Evaluate simplification in multilingual setting.

weekly update:

  • no update (still working on T354653)
Apr 19 2024, 4:08 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T354653: Work on model optimization and scaling.

weekly update:

  • finished experiments with Flan-T5 model in English
  • writing up results for the meta-page (incomplete draft as doc)
Apr 19 2024, 4:07 PM · Research (FY2023-24-Research-April-June)
MGerlach added a comment to T352543: Review and select submissions for research track.

weekly update:

  • checked the review form
  • testing the bidding interface for reviewers and prepared upcoming comms with PC members
  • monitoring submissions (deadline next week) and answering questions by authors
Apr 19 2024, 4:05 PM · Research (FY2023-24-Research-April-June), Research-outreach
MGerlach updated subscribers of T362419: Attend Wikimedia Hackathon 2024.
Apr 19 2024, 8:51 AM · Essential-Work, address-knowledge-gaps, Research-foundational, Research-outreach, Research
MGerlach updated the task description for T362419: Attend Wikimedia Hackathon 2024.
Apr 19 2024, 8:48 AM · Essential-Work, address-knowledge-gaps, Research-foundational, Research-outreach, Research