@cisene I really have to ask: where the hell is this delta from everyone else coming from? It would be awesome to see a diff file, i.e. which feeds are in Apple Podcasts and which are not. Google has always claimed high numbers like this, but where are they coming from?

@Todd_Blubrry It is discovery through extrapolation and some pattern matching: pulling candidate links from the index, applying a few known patterns, and seeing whether that yields any new RSS feeds. Over and over.
The delta is un-announced feeds that nobody had bothered to add to any index, as I see it.

@Todd_Blubrry I've been working mostly on the US and Europe, poked at Asia and Africa, and left the Middle East untouched, so there is probably a good chunk still out there.

@Todd_Blubrry I hope that answers your question. The iTunes and Google indexes hold the "popular ones"; the lesser-known or never-found feeds are not in those indices.

@cisene @Todd_Blubrry I'm also importing a large batch (500k) of iTunes IDs from Marco at the same time. It's finding a surprising number of feeds we didn't have. Some were duplicates, since Apple's directory sometimes has two IDs for the same podcast, but many are new to us.

@dave @Todd_Blubrry In the discovery I'm running, I'm constantly throwing away feeds that end up with duplicate chashes before trickling them to PI. Still, there is a good number of new feeds. OK, many of them are from anchor.fm and spreaker.com, but there is still a fair number of others.
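
The duplicate-chash filtering described above could look roughly like the sketch below. It assumes each candidate is a dict carrying an already computed `chash` value; the dict layout and the function name are hypothetical, not taken from the actual discovery code.

```python
def drop_duplicate_chashes(candidates, known_chashes=()):
    """Yield candidate feeds whose chash has not been seen before.

    `candidates` is an iterable of dicts with a "chash" key (assumed layout).
    `known_chashes` holds hashes already present in the index.
    Candidates without a chash are passed through, since they cannot be compared.
    """
    seen = set(known_chashes)
    for feed in candidates:
        chash = feed.get("chash")
        if chash is None:
            yield feed          # nothing to compare against, keep it
            continue
        if chash in seen:
            continue            # duplicate content hash, throw it away
        seen.add(chash)
        yield feed
```

Only the first feed per chash survives, so which copy is kept depends on the order in which candidates arrive.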

@cisene @Todd_Blubrry I'll give some stats from the live DB:

Feeds with iTunes IDs: 1,639,671

We still have about 300k left to go on Marco's import, so that will put the feeds with iTunes IDs around the 2 million mark, which fits with their numbers.

@dave @Todd_Blubrry Top 10 hosts by feed count:
fm.anchor, 1215731
com.spreaker, 770438
com.ivoox, 279730
com.soundcloud, 127955
com.podbean, 60644
com.buzzsprout, 55075
com.libsyn, 54932
fm.castbox, 48790
com.feedburner, 44216
com.podomatic, 41980
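
A list like the one above, keyed by reversed domain, can be produced from a plain list of feed URLs. The following is a minimal sketch; it naively keeps only the last two host labels, so hosts under suffixes such as `co.uk` would be grouped incorrectly, and the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def reversed_domain(url):
    """Return e.g. 'fm.anchor' for 'https://anchor.fm/...' (last two labels only)."""
    try:
        host = (urlsplit(url).hostname or "").lower()
    except ValueError:
        return ""               # malformed URL, bucket it under the empty key
    labels = host.split(".")
    if len(labels) < 2:
        return host
    return ".".join(reversed(labels[-2:]))


def top_hosts(urls, n=10):
    """Count feeds per reversed domain and return the n most common."""
    return Counter(reversed_domain(u) for u in urls).most_common(n)
```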

@cisene @Todd_Blubrry I should modify our feed dump script to generate stats at the same time it does the weekly dump. Would be very helpful.

@cisene @Todd_Blubrry The one thing we don't have a good way to track is all of the PowerPress podcasts.

@cisene @Todd_Blubrry Here is what I'm showing from the 4/25 database dump:

Anchor: 1,334,439
Spreaker: 770,408
ivoox: 279,635
Soundcloud: 129,164
Buzzsprout: 83,106
Podbean: 71,601
Libsyn: 60,575
Castbox: 49,468
Podomatic: 49,075

Our Libsyn count seems a bit low. I thought they were over 70k from what I've seen in other places.

@cisene @dave The problem with that Blubrry number is that most of those who use our services run PowerPress on their own .com, so it is very difficult to ever get an accurate Blubrry number unless you parse it by feed generator.

@Todd_Blubrry @cisene Yes, I want to have a better number for Blubrry. How would you guys suggest we calculate that? Is the <generator> element the best way?

@Todd_Blubrry @cisene Looks like the generator tag just lists WordPress. I have an idea though. I can run stats on the weekly feed dump and check whether newestEnclosureUrl points at blubrry.com...

@Todd_Blubrry @cisene OK, that yields a count of 17,142. Does that sound closer to accurate, Todd?

cc: @mgdell

@dave @cisene @mgdell Closer ;). Make sure you are also picking up blubrry.net feeds.

@dave @Todd_Blubrry @cisene Also Podcast Mirror feeds (feeds.podcastmirror.com/xxxx )
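
Putting the suggestions in this thread together, a count over the feed dump might check both the newest enclosure host and the feed URL host against blubrry.com, blubrry.net and podcastmirror.com. This is only a sketch: it assumes each row is a dict with `url` and `newestEnclosureUrl` keys (the `url` key name is a guess), and the helper names are hypothetical.

```python
from urllib.parse import urlsplit

# Domains mentioned in the thread; matching includes their subdomains.
BLUBRRY_DOMAINS = ("blubrry.com", "blubrry.net", "podcastmirror.com")


def host_matches(url, domains=BLUBRRY_DOMAINS):
    """True if the URL's host is one of `domains` or a subdomain of one."""
    try:
        host = (urlsplit(url or "").hostname or "").lower()
    except ValueError:
        return False            # malformed URL, treat as no match
    return any(host == d or host.endswith("." + d) for d in domains)


def count_blubrry(feeds):
    """Count rows whose enclosure or feed URL is served from a Blubrry domain."""
    return sum(
        1
        for feed in feeds
        if host_matches(feed.get("newestEnclosureUrl")) or host_matches(feed.get("url"))
    )
```

Matching on the host rather than on a substring avoids counting unrelated domains that merely contain "blubrry" in their name or path.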
