English posts

in reply to github.com

I really like XRay's way of storing things (flat, knowing which properties will be string or array, and the rels list), and I wanted to use it with my reader. Having it as a library with #17 would be nice for that, but parsing the whole h-feed at once is also very handy.

So, allow me to brainstorm / share some h-feed stuff I learned with sebsel/lees :)

I agree: the feed does not have to have an author. It can be used to define the author of the posts, but the feed itself can be authorless. An example of a mixed h-feed is Twitter > HTML through Granary.
Different ways of presenting:
1. have a children property with an array [] of objects {} that are just what you get in data when you look at an h-entry
2. have a children property with an array [] of urls, and put the h-entry objects in refs
Some h-feeds have no u-url for each h-entry. This makes it impossible to do 2. in those cases. But it might not be XRay's task to fix that.
(Same for dt-published. A lot of WP sites have the hentry class, sometimes with an article-name class, but dropped the rest of Mf1. Again: might not be XRay's task to fix that.)
What to do with h-event and h-review within the h-feed? -> The type has to be detected for each child.
When looking at a home page (e.g. aaronparecki.com), I get a card, which is great because that's the main object on that page. But when using XRay in a /reader, I want a feed. It would be nice to have a type=feed parameter, to point XRay to the type of data you are looking for.
Once you go h-feed, should you go RSS/Atom?
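To make the two presentations concrete, here is a minimal sketch in Python. It is not XRay's actual output format: the "type": "feed" key, the function names and the example entries are assumptions made for illustration only.

```python
def nest_children(entries):
    """Option 1: children is a list of the full entry objects."""
    return {"data": {"type": "feed", "children": list(entries)}}


def reference_children(entries):
    """Option 2: children is a list of urls, the objects live in refs."""
    children, refs = [], {}
    for entry in entries:
        url = entry.get("url")
        if url is None:
            # An h-entry without a u-url cannot be referenced by url.
            raise ValueError("entry without a url cannot be referenced")
        children.append(url)
        refs[url] = entry
    return {"data": {"type": "feed", "children": children}, "refs": refs}


entries = [
    {"type": "entry", "url": "https://example.com/1", "name": "One"},
    {"type": "event", "url": "https://example.com/2", "name": "Two"},
]
nested = nest_children(entries)
referenced = reference_children(entries)
```

Each child keeps its own type here, so a consumer can still tell an h-event from an h-entry. Option 2 fails loudly for entries without a url, which is exactly the limitation mentioned above.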

Day 15: hacked my own site

Today I hacked my own site. I don't want to give details now, because it's late and it needs a proper write-up, but I will soon. It is fixed now. I will update this post with a link to a more detailed article when it’s there.

When can one officially put ‘hacker’ in one’s Twitter bio? I think I’m close.

Update: I wrote the post! It's here.

Day 14: XRay like-lookup

I’ve been posting likes to Seblog for a few weeks now. I like the likes, but they all looked like ‘Seb likes this’, so I was like: looks like I like to fix my likes.

Today I hooked up XRay to Seblog, so now my site can see things on URLs I link to. (I actually run a version on a ‘secret’ URL, to keep things on my own server.) This way I can grab the name of the post I liked and show that, together with the author. If it has no name, but a photo, it says ‘Seb likes a photo’. It adds the author if it knows who it is. If it really doesn’t know anything about the page, it still defaults to ‘this’. I can fix those manually if I want.
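The fallback order described above can be sketched like this. It is written in Python for illustration (the site itself runs on PHP), and the function name, the dictionary keys and the exact ‘by …’ wording are assumptions.

```python
def like_summary(data, liker="Seb"):
    """Build a one-line summary for a like from parsed page data."""
    data = data or {}
    if data.get("name"):
        target = data["name"]      # the liked post has a name
    elif data.get("photo"):
        target = "a photo"         # no name, but there is a photo
    else:
        target = "this"            # nothing useful is known about the page
    summary = f"{liker} likes {target}"
    if data.get("author"):
        summary += f" by {data['author']}"  # only when the author is known
    return summary
```

For example, `like_summary({"photo": "http://another-example.com/a-photo/img.jpg", "author": "Someone Else"})` gives ‘Seb likes a photo by Someone Else’.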

XRay’s format consists of two parts: a data part, with information about the page it looked at, and a refs part, which is a list of URLs that are mentioned or embedded on the page. Retweets, for example, show the original tweet in the refs. I added a refs field to my pages, where I store the data part under the url of the page I mentioned and the refs part under the urls they came with.

An example, with Kirby Data and YAML:

Like-of: http://example.com/a-post

----

Refs:
  "http://example.com/a-post":
    name: A Post
    author: Someone
    repost-of: http://another-example.com/a-photo
  "http://another-example.com/a-photo":
    photo: http://another-example.com/a-photo/img.jpg
    author: Someone Else
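Building that Refs field from a lookup could look roughly like the sketch below. It is Python rather than the site's PHP, and it assumes the response is a dictionary with top-level "data" and "refs" keys as described above. The real layout of XRay's JSON may differ.

```python
def collect_refs(url, response):
    """Flatten a lookup result into one url-keyed dictionary."""
    refs = {url: response.get("data", {})}   # data part, stored under the liked url
    for ref_url, ref_data in response.get("refs", {}).items():
        refs.setdefault(ref_url, ref_data)   # refs part, stored under their own urls
    return refs


response = {
    "data": {
        "name": "A Post",
        "author": "Someone",
        "repost-of": "http://another-example.com/a-photo",
    },
    "refs": {
        "http://another-example.com/a-photo": {
            "photo": "http://another-example.com/a-photo/img.jpg",
            "author": "Someone Else",
        }
    },
}
refs = collect_refs("http://example.com/a-post", response)
```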

I can be more efficient with the moment I grab the data. For example: to send webmentions, my server already reaches out to all links in the post to find their Webmention Endpoint. If I parse the page then, I don’t need another call from XRay later on.

I also need to fiddle a bit with the things I want to display. But for now it’s okay: my server has the data for my likes. I even download pictures if a post has a photo property. I now not only own my likes, I also have my own archive of the things I like.

Day 13: video on Seblog

Today my brother and I made a silly little movie. I used the light on my iPhone to shine through the Apple logo on his MacBook, from the back, while the brightness of his screen was all the way down. All this under a blanket. We had fun and I wanted to share it.

I made a commitment to post everything I post on Instagram or Twitter on my site first, so I used my standard ‘Foto op Seblog’ Workflow to post the video. But it turned out that just adding a video to my photo field didn’t work.

So I did a little tweaking and my site now does support video. In honor of the late Vine I looped it with autoplay. Let’s see how long that lasts.

Day 12: RSVP and Events feed

Busy day today, so no big Indieweb updates. It’s not about the big updates though, it’s about useful updates.

Yesterday, I went to an event. I kind of got into the habit of just getting on my bike and cycling to events, without actually looking up any details at home. Sometimes I don’t even know the exact place I need to go, just that it’s somewhere in the city center. It’s a bad habit, but it’s a habit now.

Since I posted an RSVP to my site for yesterday's event, I was able to find the information I was looking for quickly on my phone. I have been away from Facebook for almost 3 years now and I forgot how useful this feature is.

A combined feed of my RSVPs and Events can now be found at /agenda (Dutch for ‘calendar’), so I can easily find information about where I have to go when I’m cycling and running late next time.

Day 11: a homepage widget

This is a thing I wanted for some time now. Since my weblog is now more a timeline than a blog, long posts disappear fast in the stream of short ones. But since the long posts hold their value longer, I wanted to show them longer.

Also, since I started with this #100daysofindieweb-thingy, I also started to blog in English, which I never did before. My stream of posts gets flooded with English posts.

Today I made a homepage widget. It only shows on the main page, and I can put different parts in it. It now contains a blog list for both Dutch and English, a list of blogposts with #100daysofindieweb, a stream of likes and bookmarks, and the most recent photo post.

This way I can provide a summary of what’s going on on my site. It gives me a framework, where I can always swap parts. The #100daysofindieweb tag, for example, is now a good thing to have, but is less needed 100 days from now.

Days 1 to 10 of #100daysofindieweb

I’ve been doing this #100daysofindieweb thing for 10 days now, so I thought it was time for a summary post. Here are the things I've been working on:

Presentation

I fixed how things look here on my site quite a lot. On Day 1 I fixed how my reposts look. They are no longer just a url, but now actually show the original post, which I stored during the Twitter dump. I don’t do an automated grab of the external page yet!

I also changed how my RSVPs look on Day 7. At first they were just a reply post, with a weird textual representation of ‘yes’ or ‘no’, but now they have icons.

On Day 8 I launched a ‘new post type’. In a way, this is just a different presentation of the posts I write for the purpose of writing them. I now only show the word count on those posts, not the text, because the actual text is less important.

Webmentions plugin

On Day 3, I fixed some bugs regarding relative urls, not only in my own Webmentions plugin, but also in the Kirby Toolkit. It eventually led to a new test on Webmention.rocks.

Today I also added an auto-archive function to my plugin, so it triggers archival copies for any url I mention.

Importing posts

I spent Day 5 and Day 6 importing old posts, which was actually quite hard and is still not really finished. I’m quite picky about how things should look, and they don’t look very good at the moment. There are also a lot of duplicate posts, because I imported my Twitter, and I used to link to my blog on Twitter. I still need to fix these things, but since I already spent two days on this, I’m postponing it a bit longer.

Markup and headers

On Day 2, I marked up my deleted posts. If a post has a dt-deleted in the past, it returns a 410 Gone header on its own page, and shows up as a hidden tombstone in the feed.

And on Day 4 I changed the HTTP status of posts with a dt-published in the future to 404 Not Found. Together with hiding them from the feed, this gives me scheduled posts.

Finally I fiddled a bit with my Microformats on Day 9, so my site comes out better on Indiecards.

All in all it feels like a productive first 10 days! Only 9 of those sets to go :)

Day 10: auto-archive

Yesterday Tantek encouraged people to trigger an archive in the Internet Archive for every url they mention. It is not that hard to do.

So today I added an auto-archive function to my Kirby webmentions plugin, which you can find here. I’m not aware of anyone using it besides myself, but it is set up more or less so that everyone with a Kirby site can use it. (I am planning some breaking changes though.)
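How the plugin triggers the archive is not spelled out here. One common way is to request the Internet Archive's ‘Save Page Now’ address for each mentioned url. The sketch below assumes that approach. It is Python rather than the plugin's PHP, and the function names and User-Agent string are made up.

```python
from urllib.request import Request, urlopen

# Assumption: the Save Page Now address is this prefix followed by the url.
SAVE_PREFIX = "https://web.archive.org/save/"


def archive_request_url(url):
    """Return the address that asks the Internet Archive to capture url."""
    return SAVE_PREFIX + url


def trigger_archive(url, timeout=30):
    """Request a capture and return the HTTP status (makes a network call)."""
    request = Request(
        archive_request_url(url),
        headers={"User-Agent": "auto-archive-sketch"},  # hypothetical name
    )
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `trigger_archive()` once for every url found in a post, right after sending the webmentions, would be enough for a basic auto-archive.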

I guess, by posting this to my blog, I trigger an archive of a page describing how to trigger an archive. How meta.

Day 9: invisible things and indiecards

Today I did some invisible things. The first does not count for #100daysofindieweb, because it’s too invisible, but the second is valid, I think.

My blog runs on Kirby, which uses .txt files in a folder structure. In order to show you a list of posts, my server looks at several .txt files. The more posts I post, the more .txt files my server has to open. It all held up quite a while, but since my Twitter import my site contains 8000 pages.

This was not much of a problem for the main page, because I told Kirby to look at the newest folders first, and stop after 20 posts. (They are sorted by date, after all.) But when you want to view only the blogposts, or only posts with a photo, things became slower, because it had to look further in the past. Not to mention what would happen if you ask for a category that does not exist. With 8000 posts, my server would say no.

So I indexed all my posts with a database, a little while ago, and that seemed fine. The only problem was that the database is a pain to maintain. I did add pages to the database when I posted a post via Micropub, but when I edited pages via FTP, I needed to put ?reindex=1 behind the url to trigger a database update for that page. Sometimes you forget that kind of stuff.

Anyway. Today I made a system where entries are cached. I don’t cache the whole page, I just cache the h-entry, the part in this white block. The page now checks whether a .html version of the entry exists, and also checks the timestamps on the .html, the .txt and the entry.php snippet. If the .html turns out to be current, it shows the .html, and if not, it generates a new .html file and shows that new one.
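The timestamp check boils down to something like the following sketch. It is Python instead of the PHP that runs on the site, and the function names and the `render` callback are assumptions.

```python
import os


def is_current(cache_path, source_paths):
    """True if the cached file exists and is at least as new as every source."""
    if not os.path.exists(cache_path):
        return False
    cached_at = os.path.getmtime(cache_path)
    return all(os.path.getmtime(path) <= cached_at for path in source_paths)


def cached_entry(cache_path, source_paths, render):
    """Return the cached HTML, regenerating it first when it is stale."""
    if not is_current(cache_path, source_paths):
        with open(cache_path, "w", encoding="utf-8") as handle:
            handle.write(render())  # render() builds the h-entry HTML
    with open(cache_path, encoding="utf-8") as handle:
        return handle.read()
```

Here `source_paths` would hold the post's .txt file and the entry.php snippet, so editing either one invalidates the cached .html on the next visit.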

With this new event, I reindex when the .txt has been updated. I still have to visit the page to trigger a reindex, but it’s much smoother and almost goes without thinking. I’m very happy with how things turned out.

Unfortunately, all of this is not visible, and I haven’t open-sourced this part either. And that’s not how things work in this challenge.

So, I needed to do something more visible for #100daysofindieweb. Yesterday I read about Kevin Marks’s Indiecards, and my site turned out weird on them. The main problem was that Indiecards look for the first h-* element on a page. Before today, the first h-* element on my homepage was an h-feed.

I fixed that today: my homepage now first gives an h-card. My h-entry pages still use that ‘same’ h-card as a p-author. Only my feeds now have a u-author set to /, which translates to https://seblog.nl/, which translates to the h-card on my homepage. More people do this, so I should be fine.

in reply to keithjgrant.com

About your explanation of Micropub: it's not that it enables you to create Notes, it's that it enables you to use other services to post on your own site. This post, for example, I write in Aaronpk's Quill. Thanks to Micropub, Quill can actually post this reply to my site. But Micropub can be used to create whatever post on my site, as long as the client (Quill) and server (my site) both support that kind of post. See it as an open posting API :)

Oh, and I believe step 3 should be WebSub, which is just the new name for PubSubHubbub. I'm not at step 3 though.

Anyway, welcome to the Indieweb!

Day 8: the experimental post type ‘wrote’

Yesterday I admitted on IRC that I have quit the #100dagen500woorden part of my challenge. I already stated some doubts but Tantek added me to the wiki anyway.

I felt bad for giving up (so early!) but I felt even worse about posting strange unfinished texts on my site. That’s why I quit: the challenge didn’t feel good.

Today I thought about it some more. I needed a new post-type, where I only post how many words I wrote in a day, not the actual words I wrote.

Compare it with NaNoWriMo, where thousands of people write a novel of 50,000 words in the month of November. The actual text you write in that month does not really matter. The only thing that matters is writing a certain amount of words.

I never participated in NaNoWriMo, so I have no data to export in this format, but I can actually participate this year without having to put my data into NaNoWriMo.org.

Now for the implementation details. I have the following .txt-file on my development server:

This translates to the following public post on my blog:

As you can see, Seblog calculates the number of words for me, so the only thing I have to do is write a text and put it under the wrote field. I mark up the number of words with a p-words class, which is probably not consumed by anyone, but hey, I publish this kind of data now.

Now I have no excuse left not to pick up some writing again.

Last but not least I want to coin the term ‘iceberg post’. A ‘wrote’ in the way I implemented it is an example of an iceberg post: there is a public part and a non-public part to it. It’s just a case of Partial Page Privacy, but I think the term ‘iceberg post’ has a nice ring to it.

Day 7: better RSVP representation

A little while ago I posted my first event, and soon after came my own RSVP to that event. Back then I wrote a bit of text, because an Indieweb RSVP is an h-entry with a u-in-reply-to and a p-rsvp property. Since it’s a reply, I wrote reply text.

Most people don’t display the text for RSVP posts though, myself included. And today I RSVP’ed on a Meetup.com event, so after the ‘hey you need to post this on your own site first’ bells went off I made a new RSVP on my site. I forgot the text, so it looked like this:

It’s a bit yucky. So since I already have icons here for displaying the different RSVP values (yes, no, maybe and interested), I thought let’s re-use them.

I changed a lot of code behind the scenes too, because my h-entry template became a mess with all the different options it has. (But behind the scenes doesn’t count for #100daysofindieweb!) I now display RSVPs as what I now call ‘shortposts’:

I also added my bookmarks to the feed at the homepage, because I like to show off my different types of shortposts. (At first they were only available at /bookmarks, and I'm actually subscribed to that feed myself, so I can review my own bookmarks in my reader. Nice lifehack.)

Day 6: getting old posts back (part 2)

I got my posts back! :D

Unfortunately the quality of the dump isn't very good. Markdown is sometimes not interpreted, there are double posts (Tweets linking to my blog or Instagram) and a lot of images are at width:500px, which was my default width for over 10 years.

I've been coding some kind of post-match code, to see if I can detect the double posts. It's quite hard, because a lot of tweets that are not the same post are sent within 15 minutes of each other, so published is not a good predictor. Trying to calculate the text similarity with levenshtein() seemed like the solution to all my problems, but there are quite a few mismatches still. I probably can't do this fully automated.
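For reference, this is roughly what such a similarity check looks like. The sketch is in Python, where the edit distance has to be written out because there is no built-in like PHP's levenshtein(). Normalising by the longest text is an assumption, not necessarily what the import script does.

```python
def levenshtein(a, b):
    """Edit distance between two strings, computed row by row."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                     # delete char_a
                current[j - 1] + 1,                  # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Score between 0 and 1, where 1 means the texts are identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest
```

A tweet that only adds a link to the blog post still scores high against the original text, but two unrelated short tweets can score surprisingly high too, which fits the mismatches mentioned above. Any threshold would need a manual review pass.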

I now really have to go to do some real-life stuff, so I guess today just has to be the day I imported the posts and added the images. It's not that I haven't put in the hours today, it's just that it's way too much. (Or I need to be more efficient.) You can check things out by scrolling way down in my feed, or at /blog or /tekstbeelden.

Day 5: getting old posts back (part 1)

I have good news and bad news. The good news is that I made a lot of progress in retrieving my old blogposts. The bad news is that I don’t think I can finish this today anymore.

tl;dr: I have my posts in Kirby-formatted .txt-files now, but I need more time to think about moving the images.

In an attempt to own my data, I posted all my Tweets and Instagram photos onto my weblog. To do so, I started fresh, but I never got around to putting the old posts back. I have 8000 posts on this site, but the old ones are all social media reposts, no originals, even though I have had this domain since 2006.

There are roughly four periods between 2006 and now. In the first 5 years, I used a custom-made CMS, written by a 16-year-old who taught himself PHP by reading a 175-page book. (That’s me.) In 2009, I found it too insecure to keep it running that way, so I moved to Wordpress. Looking back, it wasn’t so insecure at all. Yes, I was 16 years old when I wrote it, but I knew about SQL injection at the time, and Wordpress isn’t known for its security. My own blog at least had security by obscurity.

At the beginning of the Wordpress era I made a fresh start. Luckily I found the old server still running, so my posts should still be there in that database. (You can send a GET request with Host: seblog.nl to my mother’s website to get the last pre-Kirby version.) If not, I have a PDF somewhere, but that’s not great for importing.

After a while I moved away from Wordpress in favor of .txt-files. I did a proper import back then, so that’s what I’m working with now: the .txt-version of my blog.

And those files are great, but hideous. Somehow I thought it was a nice idea to link them together with Javascript, so this version of my blog is not archived by the Web Archive. It worked like an infinite scroll: I had one file called ‘blog.txt’, which only contained a link to the last post. At the end of that post was a link to the next post. I think my 404-page showed a C implementation of the linked list, because that was my inspiration.

Next to ‘blog.txt’ I also had ‘verhalen.txt’ and ‘tekstbeelden.txt’ for other streams of posts. In my import I wrote a nice little recursion to get a list of all the stories, so I can tag them properly:

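// Follow the chain of 'next' links from $from and collect every post in $a.
function findnext($from, $links, &$a) {
  // !empty() stops at the end of the chain without an undefined-index notice
  if (!empty($links[$from])) {
    $a[] = $links[$from];
    findnext($links[$from], $links, $a);
  }
}

// $links maps each file to the post it links to next
$verhalen = [];
findnext('verhalen', $links, $verhalen);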

There is also a lot of Markdown-that-isn’t-Markdown in my posts, which complicates things. Oh, and Wordpress-HTML. I’m trying to regex a lot out of it.

The last problem I have now are images. In Wordpress and my own creation I stored all the images in one folder, but in Kirby the convention is to store them in the folder of the page. (Kirby is one big folder structure. I like that.) I need a way to either upload them when I post the posts via Micropub, or a way to identify the location of the newly created posts from the old slug, so I can move the files on the server. I don’t want to do it by hand.

This blogpost is getting hopelessly long. Maybe I could’ve fixed my images-problem by now. But I want a clear head before I start importing. And a backup, but with 8000 posts that takes some time. I’ll resume tomorrow. At least I made progress!

This is a scheduled post. I hope I’m asleep at this moment.

Day 4: scheduled posts

I'm very late today. It's actually day 5 now in The Netherlands, but since I haven't slept yet I'm still calling this day 4.

Because I started so late, I chose an easy one for today. Unfortunately the easy ones always take longer than you expect. But hey, that makes this episode of #100daysofindieweb an episode!

I am now able to schedule posts on my blog. That's a simple one, because all I have to do is check whether the 'published' field of a post is in the past before I display it. It took a little longer because I had to do it in two places, the feed and the entry page.

The feed now only grabs posts where the 'published' is earlier than the current date, and my router now gives a 404 Not Found if a post has a 'published' that is later than the current date.
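Both checks come down to one comparison, applied in two places. Here is a minimal sketch in Python (the site itself is PHP). The function names and the dictionary layout are assumptions, and treating a post published exactly ‘now’ as visible is a choice the text above does not specify.

```python
from datetime import datetime


def visible_in_feed(posts, now):
    """Keep only the posts whose published date is not in the future."""
    return [post for post in posts if post["published"] <= now]


def entry_status(post, now):
    """HTTP status for the entry page: 404 while the post is still scheduled."""
    return 404 if post["published"] > now else 200


now = datetime(2017, 1, 12, 2, 0)
posts = [
    {"slug": "day-4", "published": datetime(2017, 1, 12, 1, 30)},
    {"slug": "day-5", "published": datetime(2017, 1, 12, 9, 0)},
]
feed = visible_in_feed(posts, now)
```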

Because my webmentions trigger on the first visit of the page, and the page never gets called, I effectively schedule webmentions with this too. Only one problem: I need to wait for a visitor to open the post once it's published to trigger the mentions. This includes the mention to Brid.gy/publish, to syndicate a post to Twitter. I leave that problem to another day though.

Anyway, it's a simple feature, but a neat one, so I'm happy with it.

Day 3: relative urls

Today I started with fiddling with reply-contexts, but that turned out to be a giant mess, because I store my own posts in a different format than posts by others, and I wanted to re-use my template. I need to come back to that another time.

Yesterday, I mentioned Martijn in my post. And when I mention urls in my posts, my server will try to notify the other site that I mentioned that page. So yesterday my site notified Martijn’s site. At least, it tried.

Martijn is known for giving parsers a hard time. My site came up with http://vanderven.se/mention.php as his webmention endpoint, which is not correct. The following things went wrong:

I mentioned http://vanderven.se/martijn, which is not his canonical url. That’s http://vanderven.se/martijn/, ending in a /.
Kirby Toolkit’s remote::get(), which I used to fetch the page, did follow the redirect, but my script didn’t update the mentioned url accordingly. Since Martijn’s endpoint link is relative (webmention.php), that / does matter: it defines Martijn’s page as a folder, so his endpoint is at http://vanderven.se/martijn/endpoint.php
Finally, it turned out that Kirby Toolkit’s url::makeAbsolute($path, $home) did not handle relative urls well. It just intelligently added $path to $home, but didn’t actually solve relative urls where $home is a folder or file.
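The effect of that trailing / is easy to show with a standards-following resolver. The sketch below uses Python's urllib.parse.urljoin, not the Kirby Toolkit function, and takes endpoint.php as the relative link, going by the final result.

```python
from urllib.parse import urljoin


def resolve_endpoint(page_url, endpoint_link):
    """Resolve a (possibly relative) endpoint link against the page url."""
    return urljoin(page_url, endpoint_link)


# Without the trailing slash, 'martijn' counts as a file and gets replaced.
without_slash = resolve_endpoint("http://vanderven.se/martijn", "endpoint.php")

# With the trailing slash, 'martijn/' is a folder and the link lands inside it.
with_slash = resolve_endpoint("http://vanderven.se/martijn/", "endpoint.php")
```

Here `without_slash` is http://vanderven.se/endpoint.php and `with_slash` is http://vanderven.se/martijn/endpoint.php, which is why the url after the redirect has to be used as the base.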

Martijn encouraged me to write some tests for Kirby Toolkit, because they use phpunit and all that stuff. It was nice, because I didn’t have any experience with this kind of testing (I just refresh the page and hope it works). Learning to test things is a goal for 2017 for me.

When I had a list of good tests, I started to change the function. I might be a little too refresh-frenzy still, but in the end I managed to pass all my own tests. After the pull request and with a little tweaking of my own scripts, I finally got Martijn’s endpoint:

http://vanderven.se/martijn/endpoint.php

So, Martijn, here is another try, still without your canonical /, just to see if it works now. :)

Day 2: 410 Gone

One day I made an RSS-feed just for Martijn, and the other day he said:

[2017-01-09 19:27:07] <Zegnat> Oh, no support for /deleted? re: https://seblog.nl/2017/01/08/1/this-is-another-test-post

Nope, I just made a test post and deleted it right away, the hard way: a proper delete. How could I know his parser was visiting right in the three minutes the post existed? My site showed a 404 Not Found, because the page did not exist.

According to /deleted, that’s not the way things should be. So today I re-posted This is another test post, and then improperly deleted it.

So when you now go to https://seblog.nl/2017/01/08/1/this-is-another-test-post, you’ll get a 410 Gone, including a Dutch human readable page which explains what that means.

I do this by setting a dt-deleted on my post. I also taught my site that if I put a future date, it won’t 410 on people until that date has passed. This sets up my own Snapchat/Instagram Stories on Seblog!
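As a rough sketch of that rule, in Python rather than the PHP the site runs on, with an assumed function name and dictionary layout:

```python
from datetime import datetime


def deleted_status(post, now):
    """Return 410 once the post's deleted date has been reached, else 200."""
    deleted = post.get("deleted")  # None means the post was never deleted
    if deleted is not None and deleted <= now:
        return 410
    return 200


now = datetime(2017, 1, 10, 12, 0)
gone = {"deleted": datetime(2017, 1, 8, 15, 26, 26)}
story = {"deleted": datetime(2017, 1, 11, 12, 0)}  # disappears tomorrow
```

With these values `deleted_status(gone, now)` is 410, while `deleted_status(story, now)` stays 200 until the future date is reached.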

In addition to a 410 Gone for direct hits on the url, the post does still pop up on feeds, but with the following markup:

<div class="h-entry" style="display:none">
  <a href="https://seblog.nl/2017/01/08/1/this-is-another-test-post" class="u-uid u-url"></a>
  <time datetime="2017-01-08 15:26:26" class="dt-deleted"></time>
</div>

This is not visible for normal users, but advanced feed readers could pick this up and delete the post in their cache. I haven’t added it to the RSS, because I doubt anyone supports this, but h-feed readers can pick it up!

I also don’t support delete via Micropub, but hey, we’re getting somewhere!

Oh and while I was at it: my private posts hide their slug now. My deleted posts don’t, because that would cause a redirect (which is 301 or 302) and I wanted a 410 Gone.

Day 1: fixing reposts

Okay, enough, I’m joining! Aaron has been doing this for ages now (that is, 26 days): 100 days of IndieWeb, in which he builds an IndieWeb related feature into his site or some other service.

I’ve been doing my own 100 days of IndieWeb for a while now, but I never blogged about the outcomes of each day. There also wasn’t much focus. I did multiple things at once and was never really satisfied. So my 100 days of IndieWeb is not about doing more, it’s about doing less, but more consistently.

My updates here won’t all be as spectacular or useful to other people as Aaron’s, but they will be updates. My site will improve a small bit every day.

I wanted to start off with something simple, and since I’m already copying Aaron I’m going to do something he did a few days ago, that is: fixing my reposts.

Before today, they looked like this:

As you can see I default to the hostname when I don’t have an author cached. In the case of a retweet, this is not useful at all! Luckily I already had the data cached from the Twitter import, so with a few tweaks I was able to show that on the page:

To make things more interesting I now have a /reposts feed too.

I'm at the HWC The Netherlands now, demoing Micropub and Brid.gy/publish! #indieweb

Another take on uploading screenshots to a Micropub Media Endpoint

Over the last few months I’ve been on IRC more often. I like the simplicity of sending plain-text messages, but from time to time I like to send a picture as well. The best way to do that on IRC is to upload the file somewhere else and send a link. Uploading files can be a hassle though.

I must admit that this problem is somewhat born because I already found a solution for it elsewhere. I followed Aaron’s recipe for creating a folder that uploads images, but for the times I needed it, I found it tedious to drag my screenshots to that folder. So here’s my alteration of it.

My workflow is nearly the same, but I choose the type ‘Voorziening’ (what’s that in English?), which makes it available in the right-click-menu. I can just select a file, right-click, and go to Voorzieningen > Upload to Media Endpoint.

I let it accept images, but you can go for documents in general as well, and the rest of the workflow is the same as Aaron’s. (Make sure to pass the input to the shell script as arguments!) The only thing is that it will receive a list of files, so I changed the shell script to:

for f in "$@"
do
    curl -i -F "file=@$f" -H "Authorization: Bearer xxx" https://example.com/media-endpoint | grep Location: | sed -En 's/^Location: (.+)/\1/p' | tr -d '\r\n' | pbcopy
done

Note that this uploads multiple files and only saves the last url to the clipboard. I just select one file per upload, so that will be fine.
