Reputation!

Congratulations on the speedy implementation of reputation in the alpha.  The secret sauce has been poured!

I'll probably have more thoughts as the mechanism of Rep unfolds (not by disclosure but by observation - I understand that the Team won't reveal any secrets, and rightly so).

But here are my initial thoughts:

Puny reps

We are but children starting our journey on Narrative, and it is fitting that our reps should be low at this stage!  Relish the sensation of being puny, and look forward to the journey of growth ahead!

Outliers

Whilst I don't want to jump to conclusions on how the algorithms stand, I will share a suspicion that they don't yet take into account outliers.

Here's an example.  Imagine Tony. 

Tony isn't your average user.  He isn't good at content. 

He just isn't. 

Let's blame poor spelling and grammar, hurdles he has struggled with unsuccessfully all his life.  Let's blame a lack of sensitivity to story structure and rhythm, coupled with a complete lack of humor and a very regrettable penchant for abusing alliterations.  All.  Over.  The shop. 

Most people perceive his pieces as peerlessly pathetic, pitiable portions of pumpkin pie (and not from palatable pumpkins - think putrid pumpkins).

But here's the thing.  Tony is the most committed voter on the platform.  He takes his civic duty very seriously, and his voting record, both in number AND quality of votes, is stellar.  He performs about 15% higher than the person in second place.

How do we value Tony? 

If an algorithm hard caps how many points a person can earn for any given category of reputation without considering the role of outliers in that category, Narrative as a whole will lose out.

If the maximum points someone can earn for voting, for instance, is 20, the fact that Tony is a full 15% more valuable to the network than the second most diligent voter - and perhaps 4 times more valuable than the average voter - is completely lost in the equation. 

Let's imagine that a further 40 points can be earned for content creation and quality, and Tony scores a 2 out of 40.  Betty, on the other hand, makes fairly good content, fairly frequently, and scores a 23 out of 40.  She is an average voter and scores 10 out of 20 for voting.

Results so far, barring other aspects of reputation?  

Betty 33/60

Tony 22/60

These results could dangerously underestimate the value of outliers.  In an election for the Tribunal, where diligence matters most, for instance... Betty might seem to be the better candidate, despite her forte being in content creation, and not in civic duties.

Also, crucially, a system functioning in this way will not encourage people to play to their strengths.  Tony would see his reputation breakdown, and once he has climbed out of his deep, soul-wrecking depression over it, might decide to cut his voting efforts in half, instead diverting half his time towards meager improvements to his writing.  This shift drops his voting performance to good, but not stellar levels (score dropping from 20 to 16), while his content score jumps from 2 to 9. 

Tony smiles: he now has 25/60 (up 3 from his previous score).

But Narrative should be frowning.  Is upgrading terrible content to merely mediocre actually worth ANY loss in Tony's strong suit: voting more than four average users combined, and consistently delivering better decisions?

The network was better off with him as an outlier, doing what he does best.
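To make the arithmetic above concrete, here is a minimal sketch of a hard-capped score.  The caps (20 for voting, 40 for content) are the made-up numbers from my example, not the real algorithm, and the function name is purely illustrative.

```python
# Hypothetical caps from the example above; the real formula is not public.
VOTING_CAP = 20
CONTENT_CAP = 40


def capped_total(voting: float, content: float) -> float:
    """Sum two category scores, clipping each one at its hard cap."""
    return min(voting, VOTING_CAP) + min(content, CONTENT_CAP)


betty = capped_total(voting=10, content=23)
tony_specialist = capped_total(voting=20, content=2)
tony_generalist = capped_total(voting=16, content=9)

# Assumption for illustration: if the second-best voter already earns the
# full 20, Tony's 15% lead would be a raw 23, which the cap clips to 20.
tony_raw_lead = capped_total(voting=20 * 1.15, content=2)

print(betty, tony_specialist, tony_generalist, tony_raw_lead)
```

Under these assumptions, Tony gains 3 points by voting less, and his lead over the second-best voter never shows up in his score at all.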

We wouldn't hire Einstein to be a cook, right?  And we'd laugh at the notion of a studio asking James Cameron to split his energies between directing, and working in the studio's legal department.

The mathematical implications of this are simple and powerful: we must allow people to be specialists, with a higher ceiling of points achievable within their own strong points.

Exactly how this is achieved in the algorithm is none of my business.  Precisely which balance is sought by the architects of this system is entirely their prerogative, but it is undeniable that we need to seek to achieve that balance.

I can say that the tweaking of this balance will best be achieved by systematically graphing outlier data, and determining how much Narrative relies on a smaller group of overperformers to achieve excellence in its various metrics.

The more a metric depends on a smaller group of outliers to achieve critical quality, the more that metric must reward the outliers... within reason.  The relationship would be proportional, but not directly proportional.  And depending on how strongly we want to dampen tendencies towards over-specialization, the algorithms can shave a bit off this influence as a final adjustment in this code function, or another function can work to offset its effects somewhat.
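Purely as an illustration of "proportional, but not directly proportional", here is one possible shape for it.  Everything in this sketch is my own assumption: the top-10% cutoff, the square-root damping, the bonus limit and both function names.  It is not how the Team does it.

```python
from typing import List


def outlier_share(contributions: List[float], top_fraction: float = 0.1) -> float:
    """Share of a metric's total that comes from the top slice of users."""
    ranked = sorted(contributions, reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0


def category_ceiling(base_cap: float, share: float,
                     damping: float = 0.5, max_bonus: float = 1.0) -> float:
    """Raise the cap as dependence on outliers grows, but sublinearly.

    damping < 1 makes the bonus level off as the share rises;
    max_bonus bounds the ceiling at base_cap * (1 + max_bonus).
    """
    return base_cap * (1 + max_bonus * share ** damping)


# Made-up voting contributions for ten users; the top user supplies 40%.
votes_cast = [40, 10, 10, 10, 10, 5, 5, 5, 3, 2]
share = outlier_share(votes_cast)
ceiling = category_ceiling(base_cap=20, share=share)
print(round(share, 2), round(ceiling, 2))
```

Lowering max_bonus, or adding a second function that offsets the result, would be the "shave a bit off" adjustment for anyone worried about over-specialization.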

Conclusion

The consequence of not modelling for this is very predictable: overall network quality loss.  Our specialists will divert too much energy away from what they are best at, and all areas of Narrative will lose some of their edge at the top end, while making much smaller gains at the bottom end.

Original Post

I know the secret sauce recipe must be protected from the forces of evil, but can any light be shone on the vote points?

Great to see reputation seems to be affecting the Niche Approval votes, but the vote point amounts don't follow any discernible pattern, from a user's perspective, and it would be helpful for people to at least know what the rationale behind them is. 

For instance, a vague description such as "The higher a person's Simple Rep, the higher the voting points their vote carries" would make sense.  The Medium article Ted published about reputation seemed to describe this, in a table about vote tallying.  But we're not seeing that play out in the vote points displayed next to each person's vote.

Person A with a Simple Rep of 1 seems to be delivering 1.0 vote point per vote.

Person B with a Simple Rep of 4 seems to be delivering 0.01 vote points per vote.

(Person A has an overall rep of 5 and Person B has an overall rep of 9.)

Is there a glitch?

 EDIT: @Bryan suggested that maybe legacy votes are given 1.0 voting points, and only votes since the rep launch get the voting points applied to them.  This makes sense, but it is problematic.  Those people who voted on a niche right before the rep launch will have essentially about 50 times the influence on the outcome, compared to those who voted after the rep launch on the same niche.  That's not a small mismatch, and I think this needs to be addressed.  The easiest way is to run a script on all niches actively under approval, to set all votes to the weight warranted by the current rep of the voter.  So none of the current niches under approval should have votes with 1.0 voting points.  @Narrative Network Team?  What sayest thou?

There is also a confusing passage in the following Medium post about Reputation published by @Ted

https://blog.narrative.network...rrative-e43c2b0e9fd2

If your SimpleRep score is less than 50, that means that more than half of your SimpleRep points are negative.

 

Molly has a SimpleRep score of 18 right now.  I can't bring myself to imagine that more than half her SimpleRep points are negative, so the above statement does not appear to be true?

I also have not been able to find anyone with a SimpleRep of 50 or more (not even 30), which means that every single user currently has more negative points than positive?

I understand and agree that everyone should have low reputation at this early stage, but according to the description of positive and negative points above, something seems to be wrong.

EDIT: all the posts above are for the attention of @Narrative Network Team and @Brian Lenz and @Ted - not sure who is more specifically in charge of reputation, so tagging you all!

Like, I wouldn't want to see me having more rep than someone else just because I owned some niches - so... But if I do - that still doesn't seem quite right. Now, with time, my contributions to those niches would obviously add or subtract, but isn't that like "buying" reputation if you get reputation simply for the act of having a niche? 

chrisabdey posted:

Like, I wouldn't want to see me having more rep than someone else just because I owned some niches - so... But if I do - that still doesn't seem quite right. Now, with time, my contributions to those niches would obviously add or subtract, but isn't that like "buying" reputation if you get reputation simply for the act of having a niche? 

I'm pretty sure I can reassure you on that front.  The word seems to be that at this time, only voting and suggesting niches affects reputation.

And I don't think they have any plans for owning a niche to automatically give you extra rep.

Everyone starts at 0.01, a score according to their rep (which is admittedly low due to incomplete scoring functionality).  The breakdown can be found here:

see: Rating and Voting Formula table

Or, rather, their voting score is reset to 0.01 for any activity that happens AFTER the launch of reputation. Before that, everyone was innocent until proven guilty with a 1.0 score.

So, while the low rep scores were explained as "not having complete data" to give a full score (cushioning the appearance of a metaphorical big red F on your report card), the voting scores/influence points were not explained, and are causing problems with voting results as well as user confidence.

Christina Gleason posted:

I've voted on practically everything. I have a rep of 5, with 1 for simple rep, 13 for QA, and 0 for certified. (Are we all 0 for certified because there's no KYC yet?) This is disheartening, because I've tried to be active by voting and leaving comments.

I can attest to seeing you vote and comment A LOT.  I feel like you have been the most frequent voter in recent times: you're everywhere on niche approvals!  Maybe the @Narrative Network Team can look into it...

Yes, everyone has 0 for certified.  The functionality for getting certified has not been rolled out yet.

Bryan posted:

Everyone starts at 0.01.... or, rather, their voting score is reset to 0.01 for any activity that happens AFTER the launch of reputation. Before that, everyone was innocent until proven guilty with a 1.0 score.

So, while the low rep scores were explained as "not having complete data" to give a full score cushioning the appearance of a metaphorical big red F on your report card, the voting scores/influence points were not explained and are causing problems with voting results. 

Forgive me if this has already been explained elsewhere... I'm just diving in here on this conversation. I'll edit/update this comment if I'm off/late. 

Now that reputation is live, the impact of one's vote is based on their reputation. It's a graduated scale based on your total reputation score. The exact breakdown can be found in this table in the spec:

https://spec.narrative.org/doc...g-and-voting-formula

It's expected that everyone has low-ish reputation at this point since we don't have the ability to post content, rate content, or comment on posts. Once we get to that point, there will be many other factors driving your overall reputation score.

Christina Gleason posted:

Are we all 0 for certified because there's no KYC yet?

Yep, exactly! Certification is coming in February 2019. There will be a nominal fee to get certified via a built-in KYC process.

Malkazoid posted:

BTW, the number of votes are no longer listed on the niche approval page.

Is the requirement still 20 votes before a niche can be approved?  If so, it would be good to bring back that display.

That's correct; 20 votes is still the minimum. Thanks for the suggestion...we'll toss your idea around 

Malkazoid posted:

There is also a confusing passage in the following Medium post about Reputation published by @Ted

https://blog.narrative.network...rrative-e43c2b0e9fd2

If your SimpleRep score is less than 50, that means that more than half of your SimpleRep points are negative.

 

Molly has a SimpleRep score of 18 right now.  I can't bring myself to imagine that more than half her SimpleRep points are negative, so the above statement does not appear to be true?

I also have not been able to find anyone with a SimpleRep of 50 or more (not even 30), which means that every single user currently has more negative points than positive?

I understand and agree that everyone should have low reputation at this early stage, but according to the description of positive and negative points above, something seems to be wrong.

Valid points. We're working through a discrepancy here between the wording and the actual reputation formula. More to come here.

Malkazoid posted:

 EDIT: @Bryan suggested that maybe legacy votes are given 1.0 voting points, and only votes since the rep launch get the voting points applied to them.  This makes sense, but it is problematic.  Those people who voted on a niche right before the rep launch will have essentially about 50 times the influence on the outcome, compared to those who voted after the rep launch on the same niche.  That's not a small mismatch, and I think this needs to be addressed.  The easiest way is to run a script on all niches actively under approval, to set all votes to the weight warranted by the current rep of the voter.  So none of the current niches under approval should have votes with 1.0 voting points.  @Narrative Network Team?  What sayest thou?

This is the designed behavior. The system didn't have reputation voting prior to the reputation release, so the original vote values of 1.0 remain. This scenario of vote weight differences will only apply to this unique situation of approvals that have spanned the release, and won't be an issue going forward.

Thanks for sharing all of your thoughts and concerns! We really appreciate all of the valuable feedback. We're excited to have the reputation system in the wild, and I'm sure we'll make iterative improvements to it over time.

Christina, I also felt very unjustly reduced. But I think the short answer is... don't worry about the low scores right now. They are disheartening, yes. But they are merely a function of new things being introduced + data confused by time and sequence + resetting all the runners to the starting line...

Keep up the good work, it will iron out. 

Brian Lenz posted:

https://spec.narrative.org/doc...g-and-voting-formula

 

Malkazoid posted:

 EDIT: @Bryan suggested that maybe legacy votes are given 1.0 voting points, and only votes since the rep launch get the voting points applied to them.  This makes sense, but it is problematic.  Those people who voted on a niche right before the rep launch will have essentially about 50 times the influence on the outcome, compared to those who voted after the rep launch on the same niche.  That's not a small mismatch, and I think this needs to be addressed.  The easiest way is to run a script on all niches actively under approval, to set all votes to the weight warranted by the current rep of the voter.  So none of the current niches under approval should have votes with 1.0 voting points.  @Narrative Network Team?  What sayest thou?

This is the designed behavior. The system didn't have reputation voting prior to the reputation release, so the original vote values of 1.0 remain. This scenario of vote weight differences will only apply to this unique situation of approvals that have spanned the release, and won't be an issue going forward.

Thanks for sharing all of your thoughts and concerns! We really appreciate all of the valuable feedback. We're excited to have the reputation system in the wild, and I'm sure we'll make iterative improvements to it over time.

Thanks Brian!

The voting formula table is very helpful. I think everyone can handle an interim "low" period with some hiccups... but I think how that "designed behavior" was (or was not) communicated caught many by surprise. Self-included.

The existing ballots will probably need very careful scrutiny by the Tribunal to make sure that 1 pre-rep vote (worth 1.0) doesn't override 20 new, artificially-low, rep-adjusted votes (worth only 0.2 in total).

All told, worries have been assuaged. Looking forward to growing my rep! 

Erik Blair posted:

I apparently have a 4 reputation despite my early adoption, many comments, likes, niche suggestions, niche voting, owning niches, etc. 
So, without reading this whole page, I already feel disenchanted with the reputation system as it currently is. 

I'd encourage you not to take these scores personally yet. I was, and it wrecked my morning... I am not very gracious, and tend to jump to worst-case scenarios. So, I'm saying to myself NOW what I wish I had heard 5 hours ago when I saw the new numbers. 

Those reputation profiles are incomplete. They will balance out. #hope

Brian Lenz posted:

This is the designed behavior. The system didn't have reputation voting prior to the reputation release, so the original vote values of 1.0 remain. This scenario of vote weight differences will only apply to this unique situation of approvals that have spanned the release, and won't be an issue going forward.

Thanks for sharing all of your thoughts and concerns! We really appreciate all of the valuable feedback. We're excited to have the reputation system in the wild, and I'm sure we'll make iterative improvements to it over time.

Hey Brian,

Thanks for this.

I'm still concerned about the current batch of niche approvals.  We have to be careful when we ask people to vote - our democracy is on trial every time a single election is not taken seriously enough by the platform.  Many of the results of this batch of niches under approval at the time of rep launch will essentially be invalid.  I don't think we can just live with that and say it is a one-off.  It isn't.  There will be other glitches in voting in the future, no doubt.  I think they should be addressed every time it is possible to do so.

So could you reassure us that the Tribunal will field all the niches that got approved when they should not have, or rejected when they should have been approved?  And is the best plan to leave it up to users to do the math on each niche that is affected by this, to determine whether the result was unduly changed by the rep launch?  I'm doubtful that's the way to do it...  

Is there any downside to what I have proposed?  To run a script that:

1) identifies the niche approvals that were active at the time of the rep launch.

2) scans each vote on those niche approvals and, if the vote was made before the rep launch timestamp, looks up the reputation of the voter at rep launch time and applies the corresponding voting points to that vote instead of the current 1.0.

It doesn't sound difficult to do, but of course, maybe there are technical hurdles you face that I am not aware of that could hinder this sort of thing... I just can't think of what those could be.  All these values are just sitting in a database that can be altered via a script right? 

Essentially, all niche approvals should end up homogenised so they have only pre-launch type votes (with a weight of 1.0 each), or only post-launch type votes so they have a weight corresponding to the reputation of the voter.
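To show what I mean, the two steps could be sketched roughly like this.  I have no knowledge of Narrative's actual data model, so every class, field and function name here is invented, the launch timestamp is made up, and weight_for_rep is a stand-in for the real table in the spec.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Vote:
    voter: str
    cast_at: datetime
    approve: bool
    weight: float = 1.0  # legacy votes carry 1.0 voting points


@dataclass(frozen=True)
class Approval:
    niche: str
    opened_at: datetime
    closes_at: datetime
    votes: List[Vote]


def active_at(approvals: List[Approval], moment: datetime) -> List[Approval]:
    """Step 1: approvals that were open when the rep launch happened."""
    return [a for a in approvals if a.opened_at <= moment < a.closes_at]


def reweight(approval: Approval, launch: datetime,
             rep_at_launch: Dict[str, int],
             weight_for_rep: Callable[[int], float]) -> Approval:
    """Step 2: give every pre-launch vote the weight its voter's rep warrants."""
    new_votes = [
        replace(v, weight=weight_for_rep(rep_at_launch[v.voter]))
        if v.cast_at < launch else v
        for v in approval.votes
    ]
    return replace(approval, votes=new_votes)


# Made-up data: one legacy vote (1.0) and one post-launch vote (0.01).
launch = datetime(2019, 1, 9, 12, 0)  # invented timestamp
approval = Approval(
    niche="example-niche",
    opened_at=datetime(2019, 1, 8),
    closes_at=datetime(2019, 1, 15),
    votes=[
        Vote("suggester", datetime(2019, 1, 8, 9, 0), approve=True),
        Vote("later_voter", datetime(2019, 1, 10, 9, 0), approve=False, weight=0.01),
    ],
)
placeholder_table = lambda rep: 0.01  # NOT the real rep-to-weight table
fixed = [reweight(a, launch, {"suggester": 5}, placeholder_table)
         for a in active_at([approval], launch)]
print([v.weight for v in fixed[0].votes])
```

The sketch returns new objects rather than changing votes in place, so the original 1.0 values would still be there if the change ever needed checking or undoing.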

If you have an extra moment, I'd appreciate hearing why this is not desirable, or not possible, or less efficient than trying to sift through the current approvals manually and determine which ones need to be appealed.  Of course, if the Team is taking it upon themselves to do this sifting, then community views on this are irrelevant.  But if the community is expected to do this sifting with a calculator in hand...  well...

 

@Brian Lenz - just realised there is an easier way (not sure why I didn't see it straight away).

If a niche was being voted on for approval when the rep launch happened, you could just ask your script to tally the votes the old way, ignoring the weighting entirely.  Then you have a list of fair results, which can be compared with the actual results - and when there is a mismatch, that's a niche the Tribunal has to act upon...
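A rough sketch of that comparison, with two assumptions of mine loudly flagged: a vote is just an (approve, weight) pair, and a niche passes on a simple majority of voting points.  The real approval rule may well differ, and the function names are invented.

```python
from typing import Dict, List, Tuple

Vote = Tuple[bool, float]  # (approve?, recorded voting points)


def approved(votes: List[Vote], use_weights: bool) -> bool:
    """Simple-majority outcome (an assumption), weighted or one-vote-each."""
    yes = sum((w if use_weights else 1.0) for ok, w in votes if ok)
    no = sum((w if use_weights else 1.0) for ok, w in votes if not ok)
    return yes > no


def flag_for_tribunal(approvals: Dict[str, List[Vote]]) -> List[str]:
    """Niches whose actual (weighted) result differs from the old-style tally."""
    return [name for name, votes in approvals.items()
            if approved(votes, True) != approved(votes, False)]


# Made-up approvals that spanned the launch.
approvals = {
    "redundant-niche": [(True, 1.0)] + [(False, 0.01)] * 5,
    "clean-niche": [(True, 0.01)] * 3 + [(False, 0.01)],
}
print(flag_for_tribunal(approvals))
```

Only the mismatches come out the other end, which is the whole point: nobody has to sit there with a calculator.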

One other important consideration, @Brian Lenz: if we don't homogenize the voting point types live in the system, voting will be biased by the perception left by the votes that were in place prior to the rep system launch.

For example, let's say someone suggests a niche that is redundant.

He immediately votes to approve the niche he has just suggested.

Then the rep system launches, and his vote is given a weight of 1.0.

Five subsequent voters see that the niche is redundant and downvote it, but their 5 downvotes only amount to 0.05, vs the single 1.0 vote in favor.

This causes the blue bar to be about 95% full, when on an equal-weight count it should be only 16.7% full, with 83.3% red.

This will heavily influence less cautious voters to simply approve the niche.
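For anyone who wants to check the numbers in this example, here is the calculation as a small sketch.  The function name is mine, and the 0.01 weight per post-launch vote is the value reported in this thread.

```python
from typing import List, Tuple

Vote = Tuple[bool, float]  # (approve?, recorded voting points)


def blue_share(votes: List[Vote], use_weights: bool) -> float:
    """Percentage of the bar that shows as approval (blue)."""
    yes = sum((w if use_weights else 1.0) for ok, w in votes if ok)
    total = sum((w if use_weights else 1.0) for _, w in votes)
    return 100 * yes / total


# One legacy approval at 1.0, five post-launch rejections at 0.01 each.
votes = [(True, 1.0)] + [(False, 0.01)] * 5

displayed = round(blue_share(votes, use_weights=True), 1)
one_vote_each = round(blue_share(votes, use_weights=False), 1)
print(displayed, one_vote_each)  # 95.2 16.7
```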

I really believe this needs fixing.

Which brings me to a suggestion.

Modes of making system changes live

I think the introduction of the rep system might show us, in a post mortem, that there is the need for several modes of updating the live Narrative ecosystem.  At its simplest, and modeled on the problem we're facing now, there could be two modes.

Brutal mode would do what seems to have been done here: immediate, direct introduction of the new variables and functions into the state of play.  This mode would be used after careful consideration of whether we can live with the results of the old and new attributes and functions existing side by side in the same niche approval.

Soft mode would instead only change the definition of classes, and those changes would only make their way into the system when the classes next get instantiated.  So a niche approval object constructed prior to the change would continue functioning as if nothing had changed, and would exhibit only the old-style votes, all equal in weight.  A niche approval object constructed after the change would start fresh with the new characteristics, and so would be homogeneously populated with votes weighted according to reputation.
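Here is a toy sketch of what I mean by Soft mode, written by a non-professional, so take it with salt.  The class, the rule functions and the 0.01 value are all my own placeholders.  The only point is that each approval keeps the rule it was constructed with.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

WeightRule = Callable[[int], float]


def flat_weight(rep: int) -> float:
    """Pre-launch rule: every vote counts 1.0."""
    return 1.0


def rep_weight(rep: int) -> float:
    """Post-launch rule. Placeholder only: returns the 0.01 seen in this
    thread for low reps; the real table lives in the spec."""
    return 0.01


@dataclass
class NicheApproval:
    weight_rule: WeightRule  # fixed when the approval is constructed
    votes: List[Tuple[bool, float]] = field(default_factory=list)

    def cast(self, approve: bool, voter_rep: int) -> None:
        self.votes.append((approve, self.weight_rule(voter_rep)))


current_rule: WeightRule = flat_weight
old_approval = NicheApproval(weight_rule=current_rule)  # opened before launch

current_rule = rep_weight                                # the rep release
new_approval = NicheApproval(weight_rule=current_rule)  # opened after launch

# Both votes are cast after the release, by a voter with rep 5.
old_approval.cast(True, voter_rep=5)
new_approval.cast(True, voter_rep=5)
print(old_approval.votes, new_approval.votes)
```

In this sketch no approval ever holds a mix of 1.0 and 0.01 votes, which is the homogeneity I am asking for.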

Sorry if this is pedantic and annoying to read - I am not a professional developer, and that may be precisely why I don't quite understand why this update to the live system was done the way it was.  From the limited perspective of my experience level, it feels like this update needed a 'Soft mode', but instead used a 'Brutal mode'.

Hi @Malkazoid,

Yep, I get what you're saying. We made the call that this was a very low risk proposition. If your hypothetical scenario has occurred (which I don't believe it has), we will trust in the fact that niches can be appealed to the Tribunal. We have people on our team regularly paying attention to niches and identifying those that are redundant or invalid for any other reason. There's always the possibility that a niche sneaks through, and if that happens, it can be appealed.

As an alpha, this approach worked just fine due to the limited subset of niches that were affected by this transition scenario. We made the call this was acceptable in the interest of getting the reputation release out. Reputation wouldn't have been released Wednesday if we had decided your approach was necessary due to added implementation complexity.

At this point, it's really a moot argument since the system is in the wild, and we aren't going to go back and retroactively change approval vote outcomes.

Thanks as always for your passionate feedback 

Brian

Brian Lenz posted:

If your hypothetical scenario has occurred (which I don't believe it has), we will trust in the fact that niches can be appealed to the Tribunal. We have people on our team regularly paying attention to niches and identifying those that are redundant or invalid for any other reason. There's always the possibility that a niche sneaks through, and if that happens, it can be appealed.

 

Nice - this is what I was hoping for.  I don't know that the problem has occurred yet, but over the course of the next few days, many niches that got caught in the rep launch zone will end, and it is highly likely some of them will have the wrong results.  As long as you guys are aware some things are likely to need fixing by the Tribunal, and are actively scanning for them - we should be sweet.

And thanks for clarifying that it was a time factor that stood behind the decision to update in this way.  Here's to hoping future timelines allow for more optimal rollouts.
