Optimize our Categorization of Praise

Hey everyone! :wave:

I wanted to point out this awesome component of our praise analysis that @liviade proposed and @zhiwei made into a reality - The categorization of praise! :orange_square: :small_red_triangle: :green_circle:

In case you’re unfamiliar, the praise categorization looks something like this:

You can find the latest cross-period analysis here. (Check near the bottom for praise categorization)

https://rawcdn.githack.com/CommonsBuild/tec-rewards/79c9740ce0c5e71fc2103e51beeea317e9e1b9a8/distribution_rounds/round-19/distribution_results/reports/round-19--cross-period-analysis-report.html

While this was implemented a while ago, I don’t think we ever took the time to check out how it works and how we can optimize it to give better results! :bar_chart:

How categorization works

We can define categories that each represent a “grouping” of certain words, and we can define which words belong to each category. The cross-period analysis then looks for those words in each instance of praise dished and puts the praise under its associated category.

From this process we can identify how often we dish praise under a certain category. We can also identify the average scores quantified for each category.

We can also see the 3 highest-scored praises across the specified period for each category. In the case above we’re looking at the previous 52 weeks. :calendar:
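To make the steps above concrete, here is a minimal sketch of keyword-based categorization. It is not the actual tec-rewards code: the function names, the `(reason, score)` input shape, and the use of plain case-insensitive substring matching are all assumptions made for illustration, and only two of our categories are included.

```python
from collections import defaultdict

# Two of the categories from this post, used here only as sample input.
CATEGORIES = {
    "attendance": ["join", "attend", "show up", "participate"],
    "twitter": ["twitter", "tweet"],
}


def categorize(reason, categories=CATEGORIES):
    """Return every category with at least one keyword found in the praise text.

    Assumption: case-insensitive substring matching, so "joining" matches "join".
    """
    text = reason.lower()
    return [
        name
        for name, words in categories.items()
        if any(word in text for word in words)
    ]


def summarize(praises, categories=CATEGORIES, top_n=3):
    """Summarize (reason, score) pairs per category.

    For each category this reports how often praise landed in it, the
    average score, and the top_n highest-scored praises.
    """
    buckets = defaultdict(list)
    for reason, score in praises:
        for name in categorize(reason, categories):
            buckets[name].append((score, reason))

    summary = {}
    for name, items in buckets.items():
        scores = [score for score, _ in items]
        summary[name] = {
            "count": len(items),
            "average": sum(scores) / len(scores),
            "top": sorted(items, key=lambda item: item[0], reverse=True)[:top_n],
        }
    return summary
```

Note that in this sketch a single praise can land in more than one category if it contains keywords from several of them.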

Current Categories and Keywords

We currently have 9 categories. Here they are with their associated keywords:

attendance

  • join
  • attend
  • show up
  • participate

discussion

  • question
  • ask
  • discuss
  • discussion
  • conversation

work

  • help
  • work
  • design
  • make
  • write
  • hack
  • edit

lead

  • host
  • lead
  • initiate
  • form
  • organize
  • steward

share

  • share
  • spread

twitter

  • twitter
  • tweet

hack

  • hack
  • test

general

  • support
  • awesome

IRL

  • trip
  • conference
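As a rough sketch, the list above could be written down as a simple mapping, which also makes it easy to check for keywords that appear in more than one category. The dictionary layout and the `shared_keywords` helper are assumptions for illustration, not the format the analysis actually uses.

```python
# The 9 categories and keywords listed above, as a plain mapping.
CATEGORIES = {
    "attendance": ["join", "attend", "show up", "participate"],
    "discussion": ["question", "ask", "discuss", "discussion", "conversation"],
    "work": ["help", "work", "design", "make", "write", "hack", "edit"],
    "lead": ["host", "lead", "initiate", "form", "organize", "steward"],
    "share": ["share", "spread"],
    "twitter": ["twitter", "tweet"],
    "hack": ["hack", "test"],
    "general": ["support", "awesome"],
    "IRL": ["trip", "conference"],
}


def shared_keywords(categories):
    """Return keywords listed under more than one category.

    The result maps each such keyword to the categories that list it.
    """
    seen = {}
    for name, words in categories.items():
        for word in words:
            seen.setdefault(word, []).append(name)
    return {word: names for word, names in seen.items() if len(names) > 1}


print(shared_keywords(CATEGORIES))
```

This only catches exact duplicates. Keywords that contain one another, such as "discuss" and "discussion", would need a separate check.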

The purpose of this forum post is to review and refine our keywords and categories. Happy to receive any proposals and suggestions! :ballot_box:

We can also make progressive iterations by updating the keywords and running the analysis and reviewing the results. :mag:

Optimizing our categories and keywords will eventually lead to a quantification guide that helps quantifiers use their best judgement by giving them accurate historical scoring data.

3 Likes

I see the keyword hack in two categories (work and hack). I think most will understand, but it might be good to explicitly call out that it’s just a tip/guide to be helpful and might not apply and/or shouldn’t overrule your judgment as a quantifier.

For IRL, I’ve also seen praise around taking mental breaks, rejuvenating, etc. Does anything come to mind other than ‘break’, which could be ambiguous?

I think another category for “self-care” would be very useful to capture. We could look for keywords such as “rejuvenate”, “mental”, “vacation”, “time off” (have to check if we can add phrases to the system).
I would avoid “break” because apps or websites often break, which could make things confusing.

Maybe it’s worth doing some research in the praise channel to see some of the historic wording.

I also think removing the hack category would be useful, since hacking has often become synonymous with working.
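On the open question of phrases such as “time off”: whether the current system supports them still needs checking, but here is a minimal sketch of how phrase and whole-word matching could work with Python's standard `re` module. The names `build_pattern`, `SELF_CARE` and `matches` are hypothetical, and whole-word matching is an assumption rather than how the analysis behaves today.

```python
import re


def build_pattern(keywords):
    """Compile one case-insensitive pattern matching any keyword as a whole word.

    Multi-word phrases such as "time off" are handled the same way as single words.
    """
    # Longest first, so a phrase is preferred over a shorter keyword it contains.
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(keyword) for keyword in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


# Keywords suggested above for a possible "self-care" category.
SELF_CARE = build_pattern(["rejuvenate", "mental", "vacation", "time off"])


def matches(pattern, text):
    """Return the matched keywords, lowercased, in the order they appear."""
    return [match.group(0).lower() for match in pattern.finditer(text)]
```

The trade-off is that whole-word matching misses inflected forms like “rejuvenating”, so either those forms need to be listed as keywords or the trailing `\b` needs to be relaxed.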

2 Likes

@Maxwe11 and I hacked through a list of updated keywords and I just ran them through the latest cross-period analysis. Check out the results!

https://rawcdn.githack.com/CommonsBuild/tec-rewards/c501ba7d9bce43c738b51c0d7f65d6d0913d793b/distribution_rounds/round-21/distribution_results/reports/round-21--cross-period-analysis-report.html

1 Like

Thanks for posting! That distribution looks reasonable, and while there are outliers, they seem to be relatively few. We talked about reviewing the uncategorized praise to see if there were any trends we could identify. Do we have that data? I saw a bulk number, which was higher than I expected: roughly 1/3 of all instances are uncategorized.
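One possible way to get at that data, sketched under assumptions: pull out the praise that matches no keyword, compute its share, and count the most common words in it to look for trends. The helper names and the substring matching are hypothetical and may differ from what the cross-period analysis does.

```python
import re
from collections import Counter


def split_uncategorized(praises, categories):
    """Return (uncategorized praise texts, their share of all praise)."""
    keywords = [word for words in categories.values() for word in words]
    uncategorized = [
        text
        for text in praises
        if not any(keyword in text.lower() for keyword in keywords)
    ]
    share = len(uncategorized) / len(praises) if praises else 0.0
    return uncategorized, share


def common_words(texts, top_n=10):
    """Return the top_n most frequent words across the given texts."""
    words = (
        word
        for text in texts
        for word in re.findall(r"[a-z']+", text.lower())
    )
    return Counter(words).most_common(top_n)
```

Frequent words in the uncategorized set are only candidates for new keywords. They still need to be read in context before being added to a category.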

It might be easier to review the uncategorised praise with the frequency analysis @enti set up earlier last month.

e.g. Feedback

Give it a spin!

https://eenti.github.io/TEC-Discord-Analysis/

1 Like

Are you able to use this tool to see the full message the word was contained in?

Seeing how often a word appears isn’t that useful on its own. What matters is whether it’s being used in a context that reliably fits under a category we define.
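Independent of what the tool supports, a minimal sketch of that kind of check, assuming we have the praise messages as plain strings, would be to list every full message that contains a given keyword. `messages_containing` is a hypothetical helper, not part of the tool.

```python
def messages_containing(keyword, messages):
    """Return the full text of every message that contains the keyword.

    Matching is case-insensitive and substring-based, so "Hack" also
    finds "hacking".
    """
    needle = keyword.lower()
    return [message for message in messages if needle in message.lower()]


# Made-up sample messages, for illustration only.
sample = [
    "for hacking on the dashboard all week",
    "for a great conversation",
    "for the Hack session notes",
]
for message in messages_containing("hack", sample):
    print(message)
```

Reading the matched messages side by side shows quickly whether a keyword is used consistently enough to belong to a category.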